EPITHELIAL AND MOUSE HEPATIC CELLS
Ju-Tao Guo

Christoph Seeger



This application claims priority to U.S. provisional
application 60/433,303, filed December 13, 2002, the
entire contents of which are incorporated by reference
herein.



GOVERNMENT RIGHTS

Pursuant to 35 U.S.C. Section 202(c), it is
acknowledged that the United States government has
certain rights in the invention described herein, which
was made in part with funds from the National Institutes
of Health Grant No. AI48046.

FIELD OF THE INVENTION 

This invention relates to the fields of molecular
biology and pathology. Novel animal cell lines and
non-hepatic human epithelial cell lines for the replication
of hepatitis C virus (HCV), as well as methods for
screening for anti-HCV drugs or HCV receptors using these
cell lines, are disclosed. Furthermore, adaptive sequence
mutations in the HCV genome, which permit replication in
non-human and non-hepatic cell lines, are also provided.



BACKGROUND OF THE INVENTION 

Several publications and patent documents are cited
in this application in order to more fully describe the
state of the art to which this invention pertains. The
disclosure of each of these citations is incorporated by
reference herein.

Hepatitis C virus (HCV) is an enveloped, positive-stranded
RNA virus that belongs to the Flaviviridae, a family of
viruses including human pathogens such as yellow fever
virus, dengue virus and West Nile virus (Q. L. Choo et
al., Science 244, 359-62 (1989)). Although broad tissue
and species tropisms are hallmarks of these viruses, HCV
replication has so far only been detected in human and
chimpanzee livers. Moreover, for reasons that are not yet
understood, HCV RNA levels in infected liver tissue are
extremely low, generally below one copy of RNA per cell,
and hence can only be detected with PCR, making it
difficult to determine whether secondary sites for viral
replication exist (J. Boisvert et al., J Infect Dis 184,
827-35 (Oct 1, 2001); R. E. Lanford et al., J Virol 69,
8079-83 (1995)).

HCV encodes a polyprotein that is processed
proteolytically into ten polypeptides (K. E. Reed, C. M.
Rice, Curr Top Microbiol Immunol 242, 55-84 (2000)).
Three of them are structural proteins required for capsid
formation (core) and assembly into enveloped viral
particles (E1 and E2). Four of them are enzymes,
including cysteine and serine proteases (NS2 and NS3), an
ATP-dependent helicase (NS3) and an RNA-directed RNA
polymerase (NS5B). The functions of the remaining three
polypeptides, p7, NS4B, and NS5A, in viral replication
are not yet known. For the study of HCV replication in
tissue culture cells, the structural proteins can be
replaced with a selectable marker, such as neomycin
phosphotransferase. See, for example, Figure 2, left panel,
of Lohmann et al. (V. Lohmann et al., Science 285, 110-3
(1999)). Replication of such subgenomic HCV replicons in
tissue culture cells has so far only been demonstrated in
the human hepatoma cell line Huh7, consistent with the
narrow host and tissue tropism of HCV infections.

HCV infection poses a significant public health
problem. Approximately 3% of the world's population has
persistent HCV infection. In 1989, the virus was
identified as the major aetiological agent responsible
for post-transfusion non-A and non-B hepatitis.
Following primary HCV infection, persistent viraemia and
chronic hepatitis develop in the majority of cases.
Efforts to elucidate the mechanisms behind viral
persistence and hepatocellular damage have been
frustrated by the lack of a reliable cell culture system
for viral propagation in vitro. In addition, as the
chimpanzee is the only experimental animal susceptible to
HCV infection, progress in research is hampered by the
lack of a small animal model to facilitate
pathophysiological studies as well as the evaluation of
antiviral treatment and vaccine strategies.

Furthermore, although the initial HCV infection is
asymptomatic, subsequent clinical manifestations of
HCV-induced liver disease include fibrosis, cirrhosis, and
hepatocellular carcinoma (Alter, H. J., and L. B. Seeff.
2000. Semin. Liver Dis. 20:17-35). Combination antiviral
therapy with alpha interferon (IFN-α) and ribavirin, a
purine nucleoside analogue, arrests disease progression
and can lead to sustained recovery in only 45 to 80% of
treated patients (Di Bisceglie, A. M., and J. H.
Hoofnagle. 2002. Hepatology 36:S121-S127). Additionally,
response to IFN-α therapy can vary significantly
depending on the viral genotype, ranging from 30 to 40%
for genotype 1 to as high as 80% for genotypes 2 and 3.
This suggests that viral determinants also play an
important role in regulating the cellular IFN response
against HCV (Kinzie, J. L., et al., 2001. J. Viral
Hepatitis 8:264-269; McHutchison, J. G., et al., 1998. N.
Engl. J. Med. 339:1485-1492). The parameters determining
the success or failure of antiviral therapy are not
understood, and their identification represents a major
challenge in HCV biology.

Therefore, there is a desperate need for non-hepatic
cell culture systems and small animal models for the
identification and characterization of antiviral agents
for the prevention and treatment of HCV infection.
Additionally, there is a need in the art to elucidate the
mechanism of HCV inhibition by IFN-α, so that other
treatments may be found.

SUMMARY OF THE INVENTION

The present invention provides HCV replicating cells
and cell lines derived from human non-hepatic cells or
non-human cells. According to one embodiment of the
invention, the cells are human epithelial cells of
non-liver origin, such as HeLa cells. According to another
embodiment of the invention, the cells capable of
replicating HCV are hepatoma and hepatocyte cells of
mouse origin, such as Hepa1-6 cells or AML12 cells,
respectively.

The present invention also provides a non-human host
animal comprising cells infected with HCV. In one
embodiment of the invention, the host animal is a mouse.
In another embodiment of the invention, the cells
infected with HCV are mouse hepatoma cells.

Also provided by the present invention are methods
for producing human non-hepatic cells or non-human cells
that are capable of replicating HCV, and cell lines
comprising the same. Such methods include transfection
with total HCV RNA or an HCV replicon which comprises one
or more adaptive mutations which facilitate replication
in a cell of interest.

The present invention further provides methods for
screening an agent that modulates HCV replication by
incubating the agent with the aforementioned cells or
administering the agent to the aforementioned host animal
comprising cells replicating HCV and assessing said agent
for modulation of HCV replication. Such agents may
inhibit or enhance production of HCV. These agents may
be cytopathic or non-cytopathic to HCV infected cells.
Agents which activate aspects of the JAK/STAT pathway may
also be screened using the cells and cell lines of the
invention.

Also provided by the present invention are
HCV-derived polynucleotides comprising adaptive mutations.
The present inventor has discovered that these mutations
are associated with expanded tropism of HCV.

Additionally, the present invention provides
polypeptides encoded by the mutated HCV polynucleotides
described above.

BRIEF DESCRIPTION OF THE DRAWINGS

Figures 1A and 1B are a Northern blot (Fig. 1A) and
micrographs (Fig. 1B) showing replication of HCV subgenomic
replicons in HeLa cells. (Fig. 1A) Detection of HCV
viral RNA. Total RNA (5 μg) was isolated from HeLa cell
lines (SL1, SL3-7) that were established from
G418-resistant cell colonies and analyzed by Northern blot
analysis. Blots were hybridized with radiolabeled RNA
probes corresponding to the HCV NS5 region to detect
viral RNA (vRNA) and the ΔE1 region of human papilloma
virus (HPV) present in HeLa cells. In vitro transcribed
HCV RNA (1 ng, + 5 μg total RNA from Huh7 cells, lane 7)
served as a marker (M) and control for the hybridization
reaction and 28S ribosomal RNA as a control for the
amount of RNA present in each sample analyzed. GS4.1 is a
Huh7-derived cell line expressing HCV subgenomic
replicons. RNA in SL1 cells was analyzed from cells
harvested at the indicated passage (p). SL3-7 were
analyzed at p3. (Fig. 1B) Immunohistochemical analysis
of HCV replication in HeLa cells. Expression of NS5A in
GS4.1 and SL1 cells (p26) was detected with a monoclonal
antibody bound to fluorescein isothiocyanate
(FITC)-conjugated antibody. Parental Huh7 and HeLa cells
served as controls.

Figure 2 diagrams sequence analysis of HCV
subgenomes in HeLa and mouse hepatoma Hepa1-6 cells.
(Left section) The physical map of HCV subgenomic RNA
includes the positions of the first amino acid of NS3 and
the last residue of the polyprotein. The internal ribosomal
entry site for the translation of the NS genes is
indicated (EMCV-IRES). (Right section) Mutations causing
amino acid changes identified in cDNAs isolated from
subgenomic replicons present in GS4.1 and the indicated
HeLa (SL1 (p26); SL2 (p5)) and Hepa1-6 cell lines (MH1
(p12), MH2 (p4), MH4 (p4)) are depicted with horizontal
bars. Four independent clones were sequenced from each
PCR fragment that was amplified from cDNAs obtained from
total RNA purified from the indicated cell lines.
Mutations present in more than one cell line are
indicated by amino acid position. Mutations that
occurred in 50% of the clones analyzed are marked with an
asterisk. Mutations that occurred in only one of four
clones analyzed were not included in the figure. A
deletion identified in cDNA clones obtained from SL1
cells spanning amino acids 2371 to 2413 is indicated (Δ).
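
The clone-counting rule stated in this legend can be written out explicitly. The following Python sketch is illustrative only and is not taken from the application: the function name, the data layout, and the mutation labels are hypothetical. It assumes one set of observed mutations per sequenced clone, omits mutations seen in a single clone, and flags mutations seen in exactly 50% of the clones.

```python
from collections import Counter


def classify_mutations(clones):
    """Apply the reporting rule of the Figure 2 legend to one PCR fragment.

    clones: list of sets, one set of mutation labels per sequenced clone.
    Returns {mutation: {"clones": n, "asterisk": bool}} for reported mutations.
    """
    if not clones:
        raise ValueError("at least one sequenced clone is required")
    # Count in how many clones each mutation occurs.
    counts = Counter(m for clone in clones for m in clone)
    reported = {}
    for mutation, n in sorted(counts.items()):
        if n <= 1:
            # Found in only one clone: not included in the figure.
            continue
        # An asterisk marks mutations present in exactly 50% of the clones.
        reported[mutation] = {"clones": n, "asterisk": 2 * n == len(clones)}
    return reported


# Hypothetical labels for four clones derived from one cell line.
example_clones = [
    {"mutA", "mutB"},
    {"mutA", "mutB", "mutC"},
    {"mutA"},
    {"mutA"},
]
result = classify_mutations(example_clones)
```

With these made-up inputs, mutA (four of four clones) is reported without an asterisk, mutB (two of four) is reported with one, and mutC (one of four) is dropped.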

Figures 3A and 3B are a Northern blot and
micrographs showing replication of HCV subgenomes in
mouse hepatoma cells. (A) Detection of HCV RNA. RNA in
Hepa1-6 cell lines (MH1-5) that were established from
G418-resistant cell colonies was analyzed as described in
the legend to Figure 1, except that a radiolabeled probe
specific to mouse albumin cDNA was used in lieu of the
probe against HPV. MH4-5 were analyzed at p4. (B)
Immunohistochemical analysis of HCV replication in
MH1 cells (p3). Expression of NS5A was detected as
described in Figure 1. Hepa1-6 cells served as a
negative control.

Figure 4 is a Northern blot showing detection of HCV
RNA in mouse hepatocyte cells. AML12 cells were
transfected with total RNA from GS4.1 cells, which
express subgenomic HCV replicons, and HFL cells, which
express full length HCV genomes. Five micrograms of total
RNA was isolated from AML12 cell lines (MA6-5 to MA6-8,
and MAC1) that were established from G418-resistant cell
colonies and analyzed by Northern blot analysis. sgRNA
indicates subgenomic RNA and flRNA full length genomic
RNA.

Figures 5A and 5B are two Northern blots showing the
antiviral activity of the HCV RNA polymerase inhibitor
2'-C-methyladenosine (2CMA). GS4.1 (Huh7) cells (5A) and
SL1 (HeLa) cells (5B) were treated with 10 μM 2CMA. The
cells were harvested at the indicated time points. Total
cellular RNA was extracted and viral RNA (vRNA) analyzed
by Northern blot analysis.

Figures 6A and 6B are two graphs depicting antiviral
activity of the HCV RNA polymerase inhibitor
5-OH-cytidine. GS4.1 (Huh7) cells and SL1 (HeLa) cells were
treated with the indicated amounts of 5-OH-cytidine. The
DNA polymerase inhibitor 5-OH-deoxy-cytidine was used as
a negative control. The cells were harvested 72 hours
after incubation with the drugs. Total cellular RNA was
extracted and viral RNA analyzed by Northern blot
analysis. The intensity of the bands corresponding to HCV
RNA was determined with a Fuji phosphorimager.

Figures 7A and 7B are a Northern blot and a graph
showing antiviral activity of IFN-α in Huh7 (GS4.1) and
HeLa (SL1) cell lines containing HCV replicons. (7A)
Viral RNA (vRNA) levels present in GS4.1 and SL1 cells
incubated with 0, 0.1, 0.3, 1, 3, 10, 30, and 100 IU of
IFN-α/ml (lanes 1 to 9 and 12 to 18) and with 0.01 and
0.03 IU/ml (lanes 10 and 11) for 72 h were determined by
Northern blot analysis. rRNA (28S rRNA) served as a
control for the amount of RNA loaded per lane. (7B)
Amounts of HCV RNA were determined with a Fuji
phosphorimager, and the values were plotted as the
percentages of the values obtained with untreated cells
in lanes 1 and 9.
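
Several of the graphs described in these legends express phosphorimager band intensities as percentages of the value obtained with untreated cells. A minimal Python sketch of that normalization is given here for orientation; the function name and the intensity values are hypothetical and are not data from the figures.

```python
def percent_of_untreated(band_intensities, untreated_index=0):
    """Express band intensities as percentages of the untreated control.

    band_intensities: intensities in arbitrary units, one per lane.
    untreated_index: position of the untreated (0 IU/ml) lane in the list.
    """
    reference = band_intensities[untreated_index]
    if reference <= 0:
        raise ValueError("untreated control intensity must be positive")
    return [100.0 * value / reference for value in band_intensities]


# Hypothetical intensities for an untreated lane followed by three treated lanes.
intensities = [2000.0, 1500.0, 500.0, 100.0]
percentages = percent_of_untreated(intensities)
```

The untreated lane maps to 100% by construction, so each cell line is normalized against its own control lane rather than against a shared reference.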

Figures 8A and 8B are micrographs and histograms
showing that IFN-α induces apoptosis of SL1 cells. (8A)
Annexin V-FITC staining. SL1 cells grown on glass
coverslips were left untreated (upper left) or treated
with 100 IU of IFN-α/ml for 6 h (upper right) or 20 h
(lower left) or with 100 IU of IFN-α/ml and 20 μM
caspase inhibitor ZVAD-FMK for 20 h (lower right). Cells
were then processed for annexin V-FITC staining and
viewed with a fluorescence microscope. (8B) Flow
cytometry analysis. SL1 cells were left untreated (upper
left) or treated with 100 IU of IFN-α/ml (upper right)
or with IFN-α and 20 μM ZVAD-FMK (lower left) for 24 h.
To inhibit viral replication, SL1 cells were incubated at
39°C for 60 h and then treated with 100 IU of IFN-α/ml
for 24 h (lower left). Cells were harvested and processed
for annexin V-FITC staining and analyzed by flow
cytometry. The percentages of FITC-positive cells are
indicated.

Figures 9A and 9B are Northern blots and a graph
showing a comparison of IFN-α responses against HCV and
flavivirus Kunjin virus replicons in HeLa cells. (9A) SL1
and KnNCD20 cells were incubated with 0, 0.01, 0.04,
0.16, 0.625, 2.5, 10, 40, and 160 IU of IFN-α/ml (lanes
1 to 9 and 10 to 18, respectively) for 72 h, and viral
RNA levels were determined by Northern blot analysis with
a plus-strand-specific RNA probe for the neomycin
phosphotransferase II gene. Mx-1 mRNA served as a control
for IFN-α-induced gene expression. β-Actin mRNA served
as a control for the amount of RNA loaded per lane. (9B)
The amounts of HCV and Kunjin virus replicon RNA
(arbitrary units) were determined with a Fuji
phosphorimager.

Figure 10 shows an overview of the interactions
between a virus and the IFN system. Replication of
viruses in cells produces dsRNA and viral proteins, which
activate PKR and OAS/RNase L antiviral pathways and also
signal to the promoter of the IFN-β gene by activating
transcription factors IRF3, NF-κB, and ATF2. Secreted IFN
binds to its receptor and activates receptor-associated
Jak kinases, leading to the formation of the trimeric
transcription factor ISGF3, which binds to the
IFN-stimulated response element (ISRE) on promoters of
IFN-stimulated genes. Among the products of the several
hundred genes induced by IFN, PKR, OAS/RNase L, and Mx
are the best-characterized antiviral proteins, which
inhibit different stages of viral replication and induce
apoptosis of virally infected cells.

Figures 11A-11D are Northern blots and two graphs
showing inhibition of the IFN-α response by genistein
and the V protein of HPIV2. (11A) GS4.1 cells were
incubated with 100 μg of genistein/ml for 2 h and then
with 100 IU/ml IFN-α for an additional 24 h. Viral RNA
levels were determined by Northern blot analysis. The
cells were harvested at the indicated time points, and
Mx-A mRNA and viral RNA levels were determined by
Northern blot analysis. Ribosomal 28S RNA was used as a
control for the amount of RNA loaded on each lane. (11B)
The amounts of viral RNA were measured with a
phosphorimager and plotted as percentages of the values
obtained with untreated cells. (11C) GS4.1 cells were
transfected with pCMV-E3L and pEF-HA-HPIV2 and treated
with IFN-α at the indicated concentrations for 3 days.
HCV RNA was subjected to Northern blot analysis. (11D)
Viral RNA levels were determined with a Fuji
phosphorimager and plotted as the percentages of the
values obtained with untreated cells.

Figures 12A and 12B show the dsRNA response in
parental Huh7 and HeLa cells and HCV replicon-containing
GS4.1 and SL1 cells. (12A) Phosphorylation of eIF-2α.
Huh7, GS4.1, HeLa, and SL1 cells were left untreated
(lanes 1, 5, 9, and 13) or treated with 100 IU of
IFN-α/ml for 12 h (lanes 2, 4, 6, 8, 10, 12, 14, and 16) and
then transfected with poly(I:C) and incubated for 3 h
(lanes 3, 4, 7, 8, 11, 12, 15, and 16). eIF-2α-P and
total eIF-2α were determined by Western blot analysis
with a monoclonal antibody specific for eIF-2α-P and an
antibody specific for total eIF-2α protein. (12B)
Induction of IFN-β mRNA by dsRNA and IFN-α. Parental Huh7
and HeLa cells and HCV replicon-containing GS4.1 and SL1
cells were left untreated (lanes 1, 5, 9, and 13) or
treated with 100 IU of IFN-α/ml (lanes 3, 4, 7, 8, 11,
12, 15, and 16) for 12 h and then transfected with
poly(I:C) (lanes 2, 4, 6, 8, 10, 12, 14, and 16) for 3 h.
An RNase protection assay was performed with probes
specific for IFN-β and β-actin mRNAs.

Figures 13A-13C show dose-dependent inhibition of
the IFN-α response against subgenomes by lactacystin and
epoxomicin. (13A) Cells were incubated with lactacystin
and epoxomicin at the indicated concentrations for 7 h
and then for an additional 12 h without the drugs. One
hour after incubation with the proteasome inhibitors,
IFN-α (100 IU/ml) was added for 6 h to a fraction of the
cell culture plates (lanes 6 to 10 and 16 to 20). Viral
RNA levels were determined by Northern blot analysis.
rRNA was used as a control for the amount of RNA present
in the samples. (13B and 13C) The amount of viral RNA was
measured with a phosphorimager, and values were plotted
as percentages of the values obtained with untreated
cells.

Figures 14A and 14B are Northern blots and a graph
showing that proteasome inhibitors block the IFN-α
response against HCV replicons. (14A) GS4.1 cells were
left untreated (lanes 1 to 3 and 19 to 21) or treated
with 100 IU of IFN-α/ml for 6 h (lanes 4 to 6 and 22 to
24), with 5 μM lactacystin (lanes 7 to 9 and 25 to 27) or
1 μM epoxomicin (lanes 13 to 15 and 31 to 33) alone for 7
h, or with 5 μM lactacystin (lanes 10 to 12 and 28 to 30)
or 1 μM epoxomicin (lanes 16 to 18 and 34 to 36) alone
for 1 h and then in the presence of 100 IU of IFN-α/ml
for an additional 6 h. Cells were harvested at 12 h
(lanes 1 to 18) and 18 h (lanes 19 to 36) after addition
of the cytokine. Viral RNA levels were determined by
Northern blot analysis. rRNA was used as a control for
the amount of RNA present in the samples. (14B) The
amount of viral RNA was measured with a phosphorimager
and the mean values and standard deviations from three
samples were plotted. *, P < 0.05; **, P < 0.01. PSL,
arbitrary units.

Figures 15A and 15B show a Northern blot and a graph
demonstrating that proteasome inhibitors prevent
establishment of an IFN-α response against HCV replicons.
(15A) GS4.1 cells were treated with IFN-α for 10 h,
followed by treatment with the indicated proteasome
inhibitors for 12 h. Cells were left untreated (lanes 1
to 3 and 4 to 6) or treated with 100 IU of IFN-α/ml for
10 h (lanes 7 to 18), followed by treatment with 10 μM
lactacystin (lanes 13 to 15) or 1 μM epoxomicin (lanes 16
to 18) for 12 h. Cells were harvested at 0, 10, and 18 h
after the cytokine treatment, as indicated. Viral RNA
levels were determined by Northern blot analysis. rRNA
was used as a control for the amount of RNA present in
the samples. (15B) The amount of viral RNA was measured
with a phosphorimager and the mean values and standard
deviations from three samples were plotted. PSL,
arbitrary units.

DETAILED DESCRIPTION OF THE INVENTION 

The hepatitis C virus (HCV) pandemic affects the
health of more than 170 million people and is the major
indication for orthotopic liver transplantations (OLT).
Although the human liver is the primary site for HCV
replication, it is not known whether extrahepatic tissues
are also infected by the virus and whether non-primate
cells are permissive for RNA replication. However,
because viral replication leads to the accumulation of
mutations, it is conceivable that variants can emerge
with novel properties such as the potential to replicate
in different cell types of various species. Furthermore,
accumulation of a large number of quasispecies may also
contribute to resistance to IFN-α treatment. Therefore,
it is important to determine the properties of HCV
variants, and the effect such variation has on the
efficacy of IFN-α therapy.
Provided herein is evidence that subgenomic HCV RNAs
can replicate in mouse hepatoma and non-hepatic human
epithelial cells. Moreover, efficient replication
requires adaptation of the virus to cell-type specific
environmental conditions. These results show that HCV
RNA replication can lead to the accumulation of mutants
with altered tissue and host tropism, thereby facilitating
the development of small animal models for HCV infection.

In accordance with the present invention, there are
provided nucleic acids and stably-transfected human
non-hepatic and murine hepatic cell lines that replicate
HCV. Also provided are methods of use for such cells for
identifying therapeutic antiviral agents for the
treatment of HCV infection. Additionally, the
availability of a murine line which replicates HCV
enables the production of a greatly needed mouse model of
HCV infection. Furthermore, the invention provides
polynucleotides and their corresponding polypeptides
which have adaptive mutations which result in expanded
tropism of HCV.

The detailed description set forth below describes
preferred methods for making and using the nucleic acids
and cell lines of the present invention, and for
practicing the methods of the invention. Any molecular
cloning or recombinant DNA techniques not specifically
described are carried out by standard methods, as
generally set forth, for example, in Sambrook et al.,
"Molecular Cloning: A Laboratory Manual," Cold Spring
Harbor Laboratory, 1989, and Ausubel et al., Current
Protocols in Molecular Biology, J. Wiley & Sons, 1995.

I. Definitions 

The following definitions are provided to aid in 
understanding the subject matter regarded as the 
invention. 

As used herein, "hepatitis C virus" or "HCV" shall
mean any representative of a diverse group of related
viruses classified within the hepacivirus genus of the
Flaviviridae family.

"Anti-HCV compounds" may include any inhibitor of
HCV-derived enzymes, such as protease, helicase, and
polymerase inhibitors. Anti-HCV compounds also include
IRES inhibitors, glycosylation inhibitors, and molecules
which block the HCV receptor (thus preventing entry into
cells). Other anti-HCV compounds include compounds which
enhance the specific or non-specific immune response,
thereby ameliorating HCV infection or symptoms.

"HCV replication levels" may be measured by methods
known in the art, including but not limited to detection
of replicated HCV replicons, HCV NS protein production,
or incorporation of detectably labeled nucleotides into
an HCV nucleic acid.

"Nucleic acid" or a "nucleic acid molecule" as used
herein refers to any DNA or RNA molecule, either single
or double stranded and, if single stranded, the molecule
of its complementary sequence in either linear or
circular form. In discussing nucleic acid molecules, a
sequence or structure of a particular nucleic acid
molecule may be described herein according to the normal
convention of providing the sequence in the 5' to 3'
direction. With reference to nucleic acids of the
invention, the term "isolated nucleic acid" is sometimes
used. This term, when applied to DNA, refers to a DNA
molecule that is separated from sequences with which it
is immediately contiguous in the naturally occurring
genome of the organism or virus in which it originated.
When applied to RNA, the term "isolated nucleic acid"
refers primarily to an RNA molecule that has been
sufficiently separated from other nucleic acids with
which it would be associated in its natural state (i.e.,
in cells or tissues), and explicitly includes viral RNA.
An isolated nucleic acid (either DNA or RNA) may further
represent a molecule produced directly by biological or
synthetic means and separated from other components
present during its production.
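
The definition above covers a single-stranded molecule together with the molecule of its complementary sequence, with sequences written 5' to 3'. As an illustration only, the following Python sketch derives the complementary strand in that orientation; the function name and example sequences are hypothetical, and only unambiguous bases (A, C, G, and T or U) are handled.

```python
# Translation tables for unambiguous bases; other characters pass through unchanged.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(sequence, rna=False):
    """Return the complementary strand, written 5' to 3'.

    sequence: nucleotide string given 5' to 3'.
    rna: if True, treat the sequence as RNA (U instead of T).
    """
    table = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return sequence.upper().translate(table)[::-1]


dna_strand = reverse_complement("ATGC")
rna_strand = reverse_complement("AUGC", rna=True)
```

Reversing after complementing keeps both strands in the same 5' to 3' notation, which is the convention the definition adopts.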

"RNA subgenome" refers to any molecule which lacks
some portion of a genome. For example, an RNA subgenome
can be an HCV RNA molecule in which a structural gene has
been replaced with a selection agent.

All amino acid residue sequences represented herein
conform to the conventional left-to-right amino-terminus
to carboxy-terminus orientation.

The term "isolated protein" or "isolated and
purified protein" is sometimes used herein. This term
refers primarily to a protein produced by expression of
an isolated nucleic acid molecule of the invention.
Alternatively, this term may refer to a protein that has
been sufficiently separated from other proteins with
which it would naturally be associated, so as to exist in
"substantially pure" form. "Isolated" is not meant to
exclude artificial or synthetic mixtures with other
compounds or materials, or the presence of impurities
that do not interfere with the fundamental activity, and
that may be present, for example, due to incomplete
purification, addition of stabilizers, or compounding
into, for example, immunogenic preparations or
pharmaceutically acceptable preparations.

"Variants", "mutants" and "derivatives" of
particular sequences of nucleic acids refer to nucleic
acid sequences that are closely related to a particular
sequence but which may possess, either naturally or by
design, changes in sequence or structure. By closely
related, it is meant that at least about 75%, but often
more than 90%, of the nucleotides of the sequence match
over the defined length of the nucleic acid sequence.
Changes or differences in nucleotide sequence between
closely related nucleic acid sequences may represent
nucleotide changes in the sequence that arise during the
course of normal replication or duplication in nature of
the particular nucleic acid sequence. Other changes may
be specifically designed and introduced into the sequence
for specific purposes, such as to expand the tropism of
viral RNA, or to change an amino acid codon or sequence
in a regulatory region of the nucleic acid. Such
specific changes may be made in vitro using a variety of
mutagenesis techniques or produced in a host organism
placed under particular selection conditions that induce
or select for the changes. Such sequence variants
generated specifically may be referred to as "mutants" or
"derivatives" of the original sequence. The terms
"percent similarity", "percent identity" and "percent
homology" when referring to a particular sequence are
used as set forth in the University of Wisconsin GCG
software program.

WO 2004/055216

PCT/US2003/039722
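
The "closely related" criterion in the definition of variants (at least about 75% of nucleotides matching over the defined length) can be illustrated with a simple position-by-position comparison. The Python sketch below is an assumption-laden illustration, not the GCG program's method: it presumes the two sequences are already aligned and of equal length, and the function names, default threshold argument, and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of matching positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_closely_related(seq_a, seq_b, threshold=75.0):
    """True if identity over the compared length meets the threshold."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 8-nucleotide sequences differing at one position.
identity = percent_identity("ACGTACGT", "ACGTACGA")
related = is_closely_related("ACGTACGT", "ACGTACGA")
```

A real comparison would first align the sequences and decide how to score gaps; the sketch deliberately leaves those choices out.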

An "adaptive mutation" is a mutation in a nucleic
acid sequence which produces a change in viral properties
or activity. For example, an adaptive mutation
includes, but is not limited to, a mutation which
provides enhanced tropism for HCV, or which alters the
efficacy of IFN-α treatment.

An HCV peptide, polypeptide, or protein of the
invention includes any analogue, fragment, derivative or
mutant which is derived from an HCV peptide or polypeptide
and which retains at least one property or other
characteristic of the HCV polypeptide. Different
"variants" of the HCV polypeptide exist in nature. These
variants may be alleles characterized by differences in
the nucleotide sequences of the gene coding for the
protein, or may involve different RNA processing or
post-translational modifications. The skilled person can
produce variants having single or multiple amino acid
substitutions, deletions, additions or replacements.
These variants may include inter alia: (a) variants in
which one or more amino acid residues are substituted
with conservative or non-conservative amino acids, (b)
variants in which one or more amino acids are added to
the HCV peptide or polypeptide, (c) variants in which one
or more amino acids include a substituent group, and (d)
variants in which one or more amino acids are deleted
from the HCV peptide or polypeptide. Other HCV peptides
or polypeptides of the invention include variants in
which amino acid residues from one species are
substituted for the corresponding residue in another
species, either at the conserved or non-conserved
positions. In another embodiment, amino acid residues at
non-conserved positions are substituted with conservative
or non-conservative residues. The techniques for
obtaining these variants, including genetic
(suppressions, deletions, mutations, etc.), chemical, and
enzymatic techniques, are known to the person having
ordinary skill in the art.

To the extent such variations, analogues, fragments,
derivatives, mutants, and modifications, including
alternative nucleic acid processing forms and alternative
post-translational modification forms, result in
derivatives of the HCV peptide or polypeptide that retain
any of the biological properties of the HCV peptide or
polypeptide, they are included within the scope of this
invention.

The term "functional" as used herein implies that
the nucleic or amino acid sequence is functional for the
recited assay or purpose.

The phrase "consisting essentially of" when
referring to a particular nucleotide or amino acid means
a sequence having the properties of a given sequence.
For example, when used in reference to an amino acid
sequence, the phrase includes the sequence per se and
molecular modifications that would not affect the
fundamental and novel characteristics of the sequence.

A "replicon" is any genetic element, for example, a
plasmid, cosmid, bacmid, phage or virus, that is capable
of replication largely under its own control. A replicon
may be either RNA or DNA and may be single or double
stranded.

A "vector" is a replicon, such as a plasmid, cosmid,
bacmid, phage or virus, to which another genetic sequence
or element (either DNA or RNA) may be attached so as to
bring about the replication of the attached sequence or
element.
The phrase "operably linked" when referring to
nucleic acid constructs is used herein to indicate that
the respective promoter, operator and coding sequences,
as well as any other 5' and 3' regulatory sequences, are
arranged in the appropriate location, order and reading
frame such that the desired control (e.g., expression) is
effected under appropriate conditions.

The term "oligonucleotide," as used herein refers to
primers and probes of the present invention, and is
defined as a nucleic acid molecule comprised of two or
more ribo- or deoxyribonucleotides, preferably more than
three. The exact size of the oligonucleotide will depend
on various factors and on the particular application and
use of the oligonucleotide.

The term "probe" as used herein refers to an
oligonucleotide, polynucleotide or nucleic acid, either
RNA or DNA, whether occurring naturally as in a purified
restriction enzyme digest or produced synthetically,
which is capable of annealing with or specifically
hybridizing to a nucleic acid with sequences
complementary to the probe. A probe may be either
single-stranded or double-stranded. The exact length of
the probe will depend upon many factors, including
temperature, source of probe and use of the method. For
example, for diagnostic applications, depending on the
complexity of the target sequence, the oligonucleotide
probe typically contains 15-25 or more nucleotides,
although it may contain fewer nucleotides. The probes
herein are selected to be "substantially" complementary
to different strands of a particular target nucleic acid
sequence. This means that the probes must be
sufficiently complementary so as to be able to
"specifically hybridize" or anneal with their respective
target strands under a set of pre-determined conditions.
Therefore, the probe sequence need not reflect the exact
complementary sequence of the target. For example, a
non-complementary nucleotide fragment may be attached to
the 5' or 3' end of the probe, with the remainder of the
probe sequence being complementary to the target strand.
Alternatively, non-complementary bases or longer
sequences can be interspersed into the probe, provided
that the probe sequence has sufficient complementarity
with the sequence of the target nucleic acid to anneal
therewith specifically.

The term "specifically hybridize" refers to the
association between two single-stranded nucleic acid
molecules of sufficiently complementary sequence to
permit such hybridization under pre-determined conditions
generally used in the art (sometimes termed
"substantially complementary"). In particular, the term
refers to hybridization of an oligonucleotide with a
substantially complementary sequence contained within a
single-stranded DNA or RNA molecule of the invention, to
the substantial exclusion of hybridization of the
oligonucleotide with single-stranded nucleic acids of
non-complementary sequence. For example, hybridizations
may be performed, according to the method of Sambrook et
al., Molecular Cloning, Cold Spring Harbor Laboratory
(1989), using a hybridization solution comprising: 5X
SSC, 5X Denhardt's reagent, 1.0% SDS, 100 µg/ml
denatured, fragmented salmon sperm DNA, 0.05% sodium
pyrophosphate and up to 50% formamide. Hybridization is
carried out at 37-42°C for at least six hours. Following
hybridization, filters are washed as follows: (1) 5
minutes at room temperature in 2X SSC and 1% SDS; (2) 15
minutes at room temperature in 2X SSC and 0.1% SDS; (3)
30 minutes-1 hour at 37°C in 1X SSC and 1% SDS; (4) 2
hours at 42-65°C in 1X SSC and 1% SDS, changing the
solution every 30 minutes.

One common formula for calculating the stringency
conditions required to achieve hybridization between
nucleic acid molecules of a specified sequence homology
(Sambrook et al., 1989) is as follows:

Tm = 81.5°C + 16.6 log[Na+] + 0.41(% G+C) - 0.63(%
formamide) - 600/#bp in duplex

As an illustration of the above formula, using [Na+]
= [0.368] and 50% formamide, with GC content of 42% and
an average probe size of 200 bases, the Tm is 57°C. The
Tm of a DNA duplex decreases by 1 - 1.5°C with every 1%
decrease in homology. Thus, targets with greater than
about 75% sequence identity would be observed using a
hybridization temperature of 42°C.
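
The arithmetic in the illustration above can be reproduced
with a short sketch. This is an illustrative calculation
only; the function and parameter names are assumptions
introduced here, and the logarithm is taken as base 10,
which is what reproduces the 57°C figure given in the text.

```python
import math


def melting_temperature(na_molar, percent_gc, percent_formamide, duplex_bp):
    """Return Tm in degrees C using the formula quoted in the text.

    Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%G+C) - 0.63*(%formamide) - 600/#bp
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * percent_gc
        - 0.63 * percent_formamide
        - 600.0 / duplex_bp
    )


def tm_with_mismatch(tm, percent_mismatch, drop_per_percent=1.0):
    """Lower Tm for a duplex with reduced homology.

    drop_per_percent is the decrease in degrees C per 1% loss of homology;
    the text gives a range of 1 to 1.5, so the default is one end of that range.
    """
    return tm - drop_per_percent * percent_mismatch


# Values from the illustration: 0.368 M Na+, 42% GC, 50% formamide, 200 bases.
tm = melting_temperature(0.368, 42, 50, 200)
print(round(tm, 1))  # 57.0
```

Because the per-percent decrease is given only as a range,
any identity threshold derived from tm_with_mismatch is
approximate.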

The stringency of the hybridization and wash depends
primarily on the salt concentration and temperature of
the solutions. In general, to maximize the rate of
annealing of the probe with its target, the hybridization
is usually carried out at salt and temperature conditions
that are 20-25°C below the calculated Tm of the hybrid.
Wash conditions should be as stringent as possible for
the degree of identity of the probe for the target. In
general, wash conditions are selected to be approximately
12-20°C below the Tm of the hybrid. In regards to the
nucleic acids of the current invention, a moderate
stringency hybridization is defined as hybridization in
6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 µg/ml
denatured salmon sperm DNA at 42°C, and washed in 2X SSC
and 0.5% SDS at 55°C for 15 minutes. A high stringency
hybridization is defined as hybridization in 6X SSC, 5X
Denhardt's solution, 0.5% SDS and 100 µg/ml denatured
salmon sperm DNA at 42°C, and washed in 1X SSC and 0.5%
SDS at 65°C for 15 minutes. A very high stringency
hybridization is defined as hybridization in 6X SSC, 5X
Denhardt's solution, 0.5% SDS and 100 µg/ml denatured
salmon sperm DNA at 42°C, and washed in 0.1X SSC and 0.5%
SDS at 65°C for 15 minutes.
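
The temperature windows and the three wash definitions in
the preceding paragraph can be summarized in a small
lookup sketch. The names below are assumptions made for
illustration; the numbers are taken from the paragraph
itself.

```python
def hybridization_window(tm):
    """Hybridization range in degrees C: 20-25 degrees below the hybrid's Tm."""
    return (tm - 25.0, tm - 20.0)


def wash_window(tm):
    """Wash range in degrees C: approximately 12-20 degrees below the Tm."""
    return (tm - 20.0, tm - 12.0)


# Wash step for each defined stringency level. Every level uses the same
# hybridization step (6X SSC, 5X Denhardt's solution, 0.5% SDS, 100 ug/ml
# denatured salmon sperm DNA, 42 C) and a 15 minute wash with 0.5% SDS.
STRINGENCY_WASH = {
    "moderate": {"ssc": 2.0, "temp_c": 55},
    "high": {"ssc": 1.0, "temp_c": 65},
    "very high": {"ssc": 0.1, "temp_c": 65},
}

print(hybridization_window(57.0))  # (32.0, 37.0)
print(wash_window(57.0))           # (37.0, 45.0)
```

The table makes the progression explicit: stringency rises
by lowering the SSC concentration of the wash and raising
its temperature, while the hybridization step is unchanged.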
The term "primer" as used herein refers to an
oligonucleotide, either RNA or DNA, either single-
stranded or double-stranded, either derived from a
biological system, generated by restriction enzyme
digestion, or produced synthetically which, when placed
in the proper environment, is able to functionally act as
an initiator of template-dependent nucleic acid
synthesis. When presented with an appropriate nucleic
acid template, suitable nucleoside triphosphate
precursors of nucleic acids, a polymerase enzyme,
suitable cofactors and conditions such as a suitable
temperature and pH, the primer may be extended at its 3'
terminus by the addition of nucleotides by the action of
a polymerase or similar activity to yield a primer
extension product. The primer may vary in length
depending on the particular conditions and requirement of
the application. For example, in diagnostic
applications, the oligonucleotide primer is typically
15-25 or more nucleotides in length. The primer must be of
sufficient complementarity to the desired template to
prime the synthesis of the desired extension product,
that is, to be able to anneal with the desired template
strand in a manner sufficient to provide the 3' hydroxyl
moiety of the primer in appropriate juxtaposition for use
in the initiation of synthesis by a polymerase or similar
enzyme. It is not required that the primer sequence
represent an exact complement of the desired template.
For example, a non-complementary nucleotide sequence may
be attached to the 5' end of an otherwise complementary
primer. Alternatively, non-complementary bases may be
interspersed within the oligonucleotide primer sequence,
provided that the primer sequence has sufficient
complementarity with the sequence of the desired template
strand to functionally provide a template-primer complex
for the synthesis of the extension product.

As used herein, the terms "reporter," "reporter
system", "reporter gene," or "reporter gene product"
shall mean an operative genetic system in which a nucleic
acid comprises a gene that encodes a product that when
expressed produces a reporter signal that is readily
measurable, e.g., by biological assay, immunoassay,
radioimmunoassay, or by colorimetric, fluorogenic,
chemiluminescent or other methods. The nucleic acid may
be either RNA or DNA, linear or circular, single or
double stranded, antisense or sense polarity, and is
operatively linked to the necessary control elements for
the expression of the reporter gene product. The
required control elements will vary according to the
nature of the reporter system and whether the reporter
gene is in the form of DNA or RNA, but may include, but
not be limited to, such elements as promoters, enhancers,
translational control sequences, poly A addition signals,
transcriptional termination signals and the like.

The terms "transform", "transfect", "transduce"
shall refer to any method or means by which a nucleic
acid is introduced into a cell or host organism and may
be used interchangeably to convey the same meaning. Such
methods include, but are not limited to, electroporation,
microinjection, PEG-fusion and the like.

The introduced nucleic acid may or may not be
integrated (covalently linked) into nucleic acid of the
recipient cell or organism. In bacterial, yeast, plant
and mammalian cells, for example, the introduced nucleic
acid may be maintained as an episomal element or
independent replicon such as a plasmid. Alternatively,
the introduced nucleic acid may become integrated into
the nucleic acid of the recipient cell or organism and be
stably maintained in that cell or organism and further
passed on to or inherited by progeny cells or organisms of
the recipient cell or organism. In other applications,
the introduced nucleic acid may exist in the recipient
cell or host organism only transiently.

A "clone" or "clonal cell population" is a
population of cells derived from a single cell or common
ancestor by mitosis.

A "cell line" is a clone of a primary cell or cell 
population that is capable of stable growth in vitro for 
many generations. 

A "selectable marker" or a "selection agent" refers
to a nucleic acid sequence that when expressed confers a
selectable phenotype, such as antibiotic resistance, to a
transformed cell.

A "viral antigen" shall be any peptide, polypeptide
or protein sequence, segment or epitope that is derived
from a virus that has the potential to cause a
functioning immune system of a host to react to said
viral antigen.

An "antibody" or "antibody molecule" is any
immunoglobulin, including antibodies and fragments
thereof, that binds to a specific antigen. The term
includes polyclonal, monoclonal, chimeric, and bispecific
antibodies. As used herein, antibody or antibody molecule
contemplates both an intact immunoglobulin molecule and
an immunologically active portion of an immunoglobulin
molecule such as those portions known in the art as Fab,
Fab', F(ab')2 and F(v).

The term "detectably label" is used herein to refer
to any substance whose detection or measurement, either
directly or indirectly, by physical or chemical means, is
indicative of the presence of the target bioentity in the
test sample. Representative examples of useful
detectable labels include, but are not limited to, the
following: molecules or ions directly or indirectly
detectable based on light absorbance, fluorescence,
reflectance, light scatter, phosphorescence, or
luminescence properties; molecules or ions detectable by
their radioactive properties; molecules or ions
detectable by their nuclear magnetic resonance or
paramagnetic properties. Included among the group of
molecules indirectly detectable based on light absorbance
or fluorescence, for example, are various enzymes which
cause appropriate substrates to convert, e.g., from
non-light absorbing to light absorbing molecules, or from
non-fluorescent to fluorescent molecules.

As used herein, the term "living host" shall mean
any non-human autonomous being.

II. Methods for Obtaining HCV RNA and Producing
Non-Hepatic Human Cell Lines and Non-Human Hepatic Cell
Lines that Replicate HCV.

The HCV replicating non-hepatic human cell-based and
non-human hepatic cell-based systems are prepared
according to the general methods set forth below for
isolation of nucleic acids, transformation of cultured
cells, and maintenance of cell lines.

A. Nucleic acids 

The HCV replicons of the present invention comprise
adaptive mutations which alter the ability of HCV to
replicate in different cell types. Surprisingly, the
present inventors have identified mutations which are
associated with expanded viral tropism.

The HCV nucleic acid molecules of the invention may
be prepared by two general methods: (1) They may be
synthesized from appropriate chemical starting materials,
or (2) they may be isolated from biological sources.
Both methods utilize protocols well known in the art.

The availability of nucleotide sequence information
enables preparation of an isolated nucleic acid molecule
of the invention by oligonucleotide synthesis. Synthetic
oligonucleotides may be prepared by the phosphoramidite
method employed in the Applied Biosystems 38A DNA
Synthesizer or similar devices. The resultant construct
may be purified according to methods known in the art,
such as high performance liquid chromatography (HPLC).
Long, double-stranded polynucleotides, such as a DNA
molecule of the present invention, must be synthesized in
stages due to the size limitations inherent in current
oligonucleotide synthetic methods. Thus, for example, a
3 kilobase double-stranded molecule may be synthesized as
several smaller segments of appropriate complementarity.
Complementary segments thus produced may be ligated such
that each segment possesses appropriate cohesive termini
for attachment of an adjacent segment. Adjacent segments
may be ligated by annealing cohesive termini in the
presence of DNA ligase to construct an entire 3 kilobase
double-stranded molecule. A synthetic DNA molecule so
constructed may then be cloned and amplified in an
appropriate vector.

HCV nucleic acid sequences may be isolated from
appropriate biological sources using methods known in the
art. For example, total RNA can be extracted with TRIzol
reagent from Gibco BRL, although other reagents are also
available for this purpose.

In some cases, it may be desirable to synthesize HCV
subgenomic RNA wherein a selectable marker gene is
substituted for a HCV structural gene.

The availability of HCV replicon encoding nucleic
acids enables the production of strains of laboratory
mice carrying part or all of the HCV sequence or mutated
sequences thereof. Such mice provide an in vivo model
for studying HCV infection, and analyzing possible
treatment modalities for the same.

Methods of introducing transgenes in laboratory mice
are known to those of skill in the art. Three common
methods include: 1. integration of retroviral vectors
encoding the foreign gene of interest into an early
embryo; 2. injection of DNA into the pronucleus of a
newly fertilized egg; and 3. the incorporation of
genetically manipulated embryonic stem cells into an
early embryo.

A transgenic mouse carrying an HCV replicon
comprising the adaptive mutations is generated by genomic
integration of exogenous genomic sequence encoding HCV.
These transgenic animals are useful for drug screening
studies as animal models for human diseases.

The term "animal" is used herein to include all
vertebrate animals, except humans. It also includes an
individual animal in all stages of development, including
embryonic and fetal stages. A "transgenic animal" is any
animal containing one or more cells bearing genetic
information altered or received, directly or indirectly,
by deliberate genetic manipulation at the subcellular
level, such as by targeted recombination or
microinjection or infection with recombinant virus. The
term "transgenic animal" is not meant to encompass
classical cross-breeding or in vitro fertilization, but
rather is meant to encompass animals in which one or more
cells are altered by or receive a recombinant DNA
molecule. This molecule may be specifically targeted to
a defined genetic locus, be randomly integrated within a
chromosome, or it may be extrachromosomally replicating
DNA. The term "germ cell line transgenic animal" refers
to a transgenic animal in which the genetic alteration or
genetic information was introduced into a germ line cell,
thereby conferring the ability to transfer the genetic
information to offspring. If such offspring, in fact,
possess some or all of that alteration or genetic
information, then they, too, are transgenic animals.

A type of target cell for transgene introduction is
the embryonal stem cell (ES). ES cells may be obtained
from pre-implantation embryos cultured in vitro (Evans et
al., (1981) Nature 292:154-156; Bradley et al., (1984)
Nature 309:255-258; Gossler et al., (1986) Proc. Natl.
Acad. Sci. 83:9065-9069). Transgenes can be efficiently
introduced into the ES cells by standard techniques such
as DNA transfection or by retrovirus-mediated
transduction. The resultant transformed ES cells can
thereafter be combined with blastocysts from a non-human
animal. The introduced ES cells thereafter colonize the
embryo and contribute to the germ line of the resulting
chimeric animal.

B. Cell lines 

The cell lines of the invention include any cell
which supports production of HCV components. These cells
include human, non-hepatic cells and/or non-human hepatic
cells, such as murine hepatic cells. Cell lines useful
for practice of the invention include, but are not
limited to HELA, a non-hepatic epithelial cell line (ATCC
CRL number CCL-2.2), Hepa1-6, a murine hepatoma cell line
(ATCC CRL number-1830), and AML-12, a murine hepatocyte
cell line (ATCC CRL number-2254).

To achieve stable gene transfer, HCV subgenomic RNA
is introduced into host cells. This may be accomplished
according to numerous methods known in the art,
including, but not limited to: (1) calcium phosphate
transfection; (2) transfection with DEAE-dextran; (3)
electroporation; and (4) liposome-mediated transfection.
For general protocols, see, e.g., chapter 9 in Current
Protocols in Molecular Biology, Ausubel et al. (editors),
John Wiley & Sons, Inc. 1987-1995. For stable transfer of
nucleic acids into mammalian cells, the liposome-mediated
transfection method may be used in the present invention
because of the large amount of nucleic acid that can be
introduced into the cells, thereby increasing the
possibility of integration of the nucleic acid into the
host genome.

Cells are grown according to standard methods known
in the art, such as those set forth in Culture of Animal
Cells: A Manual of Basic Technique by R. Ian Freshney,
4th Edition, available from the ATCC.

Stable transfectants are selected by the ability of
an individual cell colony to grow in the presence of a
selection agent, e.g., an antibiotic, by virtue of a
resistance-encoding gene on HCV RNA or by isolating cells
using FACS and antibodies directed against any HCV protein.

Detection and quantitation of expression of HCV gene
products in stably-transfected cell lines of the
invention can be accomplished using a variety of known
assays. For instance, cells transformed with the RNA
subgenomes of HCV can be selected with an antibiotic such
as G418 (neomycin). Alternatively, cells may be selected
based on the presence and accumulation of HCV RNA or HCV
gene products. As another example, the starting HCV
encoding nucleic acids may be modified to also comprise a
hygromycin or puromycin or any other resistance or
reporter gene, such that cells transfected with the
nucleic acids can be selected by their ability to grow on
hygromycin- or puromycin-containing medium.
Alternatively, selectable markers including luciferase,
beta lactamase etc., may be utilized which allow for the
selection of cells by FACS and related procedures. In an
alternative embodiment, a separate plasmid may be
constructed that comprises an antibiotic resistance gene,
and can be used to co-transfect cells along with the
subgenomic RNA molecules. Further, as described in
detail in the following Examples, cells stably
transfected with the subgenomic HCV RNA are grown in the
appropriate medium for a selected period of time, the
medium is then collected and analyzed for the presence of
HCV RNA by dot blot hybridization or by conventional
Northern hybridization, using a radioactively labeled
probe having HCV DNA or RNA complementary sequences.
Alternatively, viral gene products may be detected in the
cells of the invention using conventional methods,
including, without limitation, immunoassay and Western
blotting.

Using the assays described above, stably-transfected
cell lines can be selected which possess optimum
characteristics for use in cell-based assays for
screening potential anti-viral compounds.

Another aspect of the invention includes a non-human
host animal which comprises the HCV expressing cells of
the invention. These animals may be produced by
administration of an HCV replicating cell, or an HCV
encoding nucleic acid having one or more adaptive
mutations which permit replication in mice. The cells or
viral nucleic acid could be directly injected
intravenously (e.g., via tail vein injection),
intramuscularly, subcutaneously, or via intrahepatic
injection. Alternatively, transgenic mice could be
produced using the HCV replicons of the invention, as
described above.

III. Uses of Cell Lines for Cell-Based Assays of
Potential Anti-HCV Agents

The human non-hepatic and murine hepatic cell lines
of the invention which replicate HCV may be used in
research, diagnostic, and therapeutic applications,
including cell-based assays to evaluate the effectiveness
of potential anti-HCV compounds, utilizing methodologies
known in the art. Typical assays are summarized herein
below. These cell-based assays may be performed in
standard cell culture media utilizing commonly-available
equipment, reagents and culture containers.

Persons skilled in the art will appreciate that
these assays represent exemplary embodiments, and may be
varied to provide similar/equivalent equipment or
reaction conditions. For example, a variety of genes
encoding antibiotic resistance are available, and can be
utilized in accordance with the present invention in the
generation of the cell lines of the invention. In a
preferred embodiment, RNA isolated from parental human
hepatic or untransformed cells is also utilized as a
control in the assays described herein below to determine
the effects of potential anti-viral compounds on HCV
expressed in the cells. The control RNA is obtained in a
manner similar to the HCV RNA. This cell line is treated
in the assays described herein below as a negative
control, to assure that any effects observed are due to
the action of the compound being tested on HCV, and not
non-specific effects due to the introduction of RNA into
the cells.

A. General Cell-Based Assay for Inhibitors of HCV
replication

96-well microtiter plates are seeded with an
appropriate amount of cells which replicate HCV in a
standard cell culture medium containing G418 (e.g., 400
µg/ml), as well as standard concentrations of penicillin,
streptomycin and kanamycin or gentamicin to prevent
bacterial and mycoplasma contamination. The cells are
incubated at 37°C in a humidified 5% CO2 incubator. On day
0 wells are washed three times with warm phosphate-
buffered saline (PBS). The culture medium is then
replaced with fresh medium containing 0.3%
dimethyl sulfoxide (DMSO), 10% fetal calf serum (FCS),
penicillin, streptomycin, kanamycin/gentamicin,
containing one of the following ingredients: (1) various
concentrations of a known HCV inhibitor, such as
interferon alpha, as a positive control; and (2) various
concentrations of one or more of the compounds to be
tested. The plates are incubated at 37°C in a humidified
5% CO2 incubator for 24, 48, and 72 hours. The plates
are washed twice with PBS and then with a solution of
methanol and acetone (1:1) to fix the cells. The cells
are then incubated with an antibody specific for a viral
protein (i.e. NS5A) according to the standard methods,
such as enzyme linked immunosorbent assay (ELISA).
Briefly, following incubation with the primary antibody,
the plates are washed to remove unbound antibody and then
incubated with a second, enzyme-conjugated antibody that
can bind to the primary antibody. The plates are washed
again, followed by an incubation with a colorless
substrate that upon hydrolysis (cleavage) by the enzyme
yields a colored product, the concentration of which can
be determined with a spectrophotometer (microtiter plate
reader). The concentration of the product corresponds to
the levels of viral replication in cells and can be used
to determine the activity of a given drug to inhibit HCV
replication.
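
The text does not specify how plate-reader values are
converted into a measure of drug activity. One common
approach, shown here only as a sketch under stated
assumptions, is to express each treated well as percent
inhibition relative to untreated wells after subtracting
a background reading. The function name, the background
subtraction and the example readings are hypothetical
and are not taken from the source.

```python
def percent_inhibition(treated, untreated, background=0.0):
    """Percent reduction of the ELISA signal relative to untreated cells.

    treated, untreated: absorbance readings of the colored product.
    background: reading from wells without HCV-replicating cells (assumed).
    """
    signal = untreated - background
    if signal <= 0:
        raise ValueError("untreated signal must exceed background")
    return 100.0 * (1.0 - (treated - background) / signal)


# Hypothetical readings: untreated 1.25, treated 0.5, background 0.25.
print(percent_inhibition(0.5, 1.25, 0.25))  # 75.0
```

A value near 0 would indicate no effect on the measured
viral protein, and a value near 100 would indicate a
signal reduced to background.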

B. Cytotoxicity Assays

A cytotoxicity assay may be conducted to evaluate
potential anti-HCV agents, utilizing a protocol similar
to that described above. Instead of measuring HCV
replication levels, however, cytotoxicity of the various
test agents is assessed by standard procedures to
determine cell viability, proliferation and levels of
cellular metabolism including but not restricted to cell
membrane permeability, lysosomal mass-pH, cell density or
mitochondrial activity. For example, the CytoTox-ONE™
Assay from Promega is a rapid, fluorescent measure of the
release of lactate dehydrogenase (LDH) from cells with a
damaged membrane. LDH released into the culture medium
is measured with a 10-minute coupled enzymatic assay that
results in the conversion of resazurin into resorufin.
Since the CytoTox-ONE™ Reagent mix does not damage
healthy cells, released LDH can be measured directly in
assay wells containing a mixed population of viable and
damaged cells.

IV. Identification of Cell Lines Permissive for HCV
Infection

As shown herein, it is possible to produce HCV
carrying adaptive mutations that confer broad tissue and
species tropism. Using such virus stocks it will be
possible to screen cell lines of human and non-human
origin for virus infection. Briefly, probes which
correspond to unique portions of the sequence may be
used in detection methods. This method will lead to the
identification of novel cell lines that are permissive
for a complete cycle of HCV replication.

V. Screening for the HCV Receptor(s)

The ability to replicate HCV in different cell lines
facilitates the isolation of the HCV receptor. Virus
stocks similar to the ones described in section IV can be
used to isolate the HCV receptor(s). For this purpose
virus stocks carrying replicons with a selectable marker,
such as neomycin or hygromycin, will be used. Cells that
are non-permissive for infection will be transfected
with DNA isolated from cells that are known to express
the receptor (i.e. human hepatocytes, cells identified
with the procedure described in section III) and
subsequently infected with recombinant HCV carrying the
selectable marker. Cells that express the receptor can
then be selected through the addition of an antibiotic
(i.e. G418 or hygromycin) to the culture medium. Once
cells are identified, the transfected DNA can be
isolated, cloned, and sequenced. The sequence
information can then be used to identify the gene(s)
encoded by transfected DNA.



WO 2004/055216 PCT/US2003/039722



The following examples are provided to describe the 
invention in further detail. These examples are intended 
to illustrate and not to limit the invention. 

EXAMPLE I Human Non-Hepatic and Mouse Hepatic Cell Lines
that replicate HCV

MATERIALS AND METHODS 

Cell culture. Cells were purchased from the American Type
Culture Collection (BHK Kidney Mesocricetus auratus
(Syrian golden hamster) ATCC CRL-1632; Vero Kidney
epithelial Cercopithecus aethiops (African green monkey)
ATCC CCL-81; CV-1 Kidney fibroblast Cercopithecus
aethiops (African green monkey) ATCC CCL-70; HT1080
Fibrosarcoma Homo sapiens (human) ATCC CRL12012; HeLa
Cervix carcinoma Homo sapiens (human) ATCC CCL2;
McA-RH7777 Hepatoma Rattus norvegicus (rat) ATCC CRL-1601;
FT02B Hepatoma Rattus norvegicus (rat); Hepa1-6 Hepatoma
Mus musculus (mouse) ATCC CRL-1830; AML12 Hepatocyte Mus
musculus (mouse) ATCC CRL-2254; FL83B Hepatocyte Mus
musculus (mouse) ATCC CRL-2390). The Huh7-derived cell
lines GS4.1 and GS4.5 are subclones derived from cell
lines FCA1 and FCA4, respectively (Guo, J. T., et al.,
2001, J. Virol. 75:8516-8523). Cell line Bsp8 is a
Huh7-derived cell line expressing HCV-N subgenomic replicon
IbneoAS (Guo, J. T., et al., 2001, J. Virol.
75:8516-8523). All cultures were grown in Dulbecco's modified
Eagle's medium (Gibco-Invitrogen) supplemented with 10%
fetal bovine serum, L-glutamine, nonessential amino
acids, penicillin, and streptomycin.

RNA transfection. All the plasmids were linearized with
ScaI, and RNA was synthesized with the MEGAscript kit
(Ambion). In vitro-transcribed RNA was purified as
previously described (Guo, J. T., et al., 2001, J.
Virol. 75:8516-8523). Total cellular RNA was extracted
with Trizol reagent (Invitrogen). The conditions used for
the transfection of cells with total RNA were identical
to those used for the transfection with in vitro-
transcribed RNA (Guo, J. T., et al., 2001, J. Virol.
75:8516-8523). Colonies were selected with G418 at a
concentration of 1 mg/ml.

RNA analysis. Total cellular RNA was extracted from
transfected cell lines with Trizol reagent. Five
micrograms of total RNA was fractionated on 1% agarose
gels containing 2.2 M formaldehyde and transferred onto a
nylon membrane. Membranes were hybridized with riboprobes
specific for plus-stranded HCV replicon RNA, human
papillomavirus (HPV) E6, and mouse albumin mRNA as
described previously (Guo, J. T., et al., 2001, J.
Virol. 75:8516-8523). The HPV and mouse albumin probes
spanned nucleotides 811 to 1491 (GenBank accession number
M20325) and nucleotides 1501 to 1988 (GenBank accession
number XM_132149), respectively.

Reverse transcription-PCR and DNA sequencing. Nucleotide
and amino acid numbers correspond to the HCV type 1b
genome Con-1 (AJ238799). HCV replicons were isolated and
cloned from established cell lines by PCR amplification
of three fragments spanning the entire NS region from
position 3420 to 9410. The untranslated regions at the 5'
and 3' ends of HCV RNA were cloned separately for
nucleotide sequence analysis. DNA synthesis was carried
out with Superscript II reverse transcriptase provided in
a cDNA synthesis kit (Gibco-Invitrogen). The DNA
oligomers used as primers for the reverse transcription
reaction mapped to positions 485 to 465, 5492 to 5473,
7256 to 7234, 9410 to 9388, and 9616 to 9597. The
reaction mixtures were incubated for 1 h at 45°C. PCR was
performed with an Advantage PCR kit (Clontech). One
microliter of the cDNA reaction mixture was used for PCRs
with 19- to 23-nucleotide-long primers that yielded
fragments spanning positions 1 to 464, 1387E to 5082,
5016 to 7226, 7154 to 9387, and 9239 to 9616. Position
1387E refers to an oligomer specific for the
encephalomyocarditis virus (EMCV) internal ribosome entry
site (IRES) element located upstream of NS3. The PCR
products were cloned into plasmid pGEM-T Easy (Promega).
Four clones of each fragment were sequenced with an ABI
automatic DNA sequencer, and a consensus sequence was
established with the help of a sequence assembly program
(Genetics Computer Group).
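The consensus step can be illustrated with a short sketch. The Python below is not the sequence assembly program named above; it is a minimal majority-vote illustration that assumes the clone reads are already aligned and of equal length. The function name and the four toy reads are invented for the example.

```python
from collections import Counter


def consensus(clones):
    """Return (consensus string, support fraction per position).

    Assumes the reads are pre-aligned and of equal length. Ties are
    resolved in favor of the residue seen first in the input order.
    """
    if len({len(read) for read in clones}) != 1:
        raise ValueError("clones must be aligned to the same length")
    residues, support = [], []
    for column in zip(*clones):
        residue, count = Counter(column).most_common(1)[0]
        residues.append(residue)
        support.append(count / len(column))
    return "".join(residues), support


if __name__ == "__main__":
    # Toy reads, invented for illustration only (not HCV sequence data).
    toy_clones = ["GATTACA", "GATTACA", "GACTACA", "GATTACA"]
    sequence, fractions = consensus(toy_clones)
    print(sequence)
    print(fractions)
```

With four clones per fragment, a support fraction of 0.5 at a position means that the change was seen in two of the four clones.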

Long reverse transcription-PCR was performed with an
Advantage-GC kit (Clontech) with a pair of primers
beginning at positions 1415E, upstream of NS3, and 7989,
within NS5B. The PCR conditions were modified as follows:
step 1, 95°C for 3 min; step 2, 5 cycles, 30 s at 95°C
and 6 min at 72°C; step 3, 27 cycles, 30 s at 95°C and 6
min at 68°C; step 4, 68°C for 6 min. PCR products were
gel purified, digested with HindIII and MfeI, and used to
replace the corresponding fragment in plasmid
I377/NS3-3'.
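The four-step cycling program can be written down as data, which makes the cycle count and the programmed run time easy to check. The sketch below is illustrative only: the data layout and function names are assumptions, and ramping time between temperatures is ignored.

```python
# Each step: (label, repeats, [(temperature in deg C, hold in seconds), ...])
LONG_RT_PCR_PROGRAM = [
    ("step 1", 1, [(95, 3 * 60)]),
    ("step 2", 5, [(95, 30), (72, 6 * 60)]),
    ("step 3", 27, [(95, 30), (68, 6 * 60)]),
    ("step 4", 1, [(68, 6 * 60)]),
]


def total_hold_seconds(program):
    """Sum the programmed hold times, ignoring ramping between temperatures."""
    return sum(
        repeats * sum(seconds for _temp, seconds in holds)
        for _label, repeats, holds in program
    )


def amplification_cycles(program):
    """Count repeats of the two-temperature (denature/extend) steps."""
    return sum(repeats for _label, repeats, holds in program if len(holds) == 2)


if __name__ == "__main__":
    seconds = total_hold_seconds(LONG_RT_PCR_PROGRAM)
    print(f"cycles: {amplification_cycles(LONG_RT_PCR_PROGRAM)}")
    print(f"programmed hold time: {seconds} s ({seconds / 60:.0f} min)")
```

Under these assumptions the program comprises 32 amplification cycles and 13,020 s (217 min) of programmed hold time.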

Plasmid construction. All plasmids (Table 3) were derived
from the parental HCV Con-1 replicon I377/NS3-3'
(AJ242652). Subgenomes containing consensus mutations
were constructed by replacing DNA restriction fragments
with the corresponding fragments from the pGEM-T Easy
cDNA libraries (see above). The resulting plasmids with
the amino acid changes in the NS region are listed in
Table 3.

Immunofluorescence. Cells were plated on coverslips in
six-well plates at least 16 h before treatment, washed
with phosphate-buffered saline, and fixed with cold
methanol-acetone (1:1) for 15 to 20 min. Next, the cells
were blocked in phosphate-buffered saline containing 10%
fetal bovine serum for 30 min at room temperature and
then incubated with anti-NS5A antibodies (a gift from
Chen Liu) and fluorescein isothiocyanate-conjugated goat
anti-mouse immunoglobulin antibodies (Jackson
Laboratories). In addition, cells were stained with the
DNA binding fluorochrome DAPI (4',6'-diamidino-2-
phenylindole). Coverslips were mounted with antifade
agent (Molecular Probes), examined with a Nikon
immunofluorescence microscope, and photographed with a
charge-coupled device camera.

RESULTS

HCV replication in cells of nonhepatic origin.

HCV exhibits a very narrow host range and infects
only humans and chimpanzees. We asked whether this
limitation was due to determinants of RNA replication.
Because efficient replication of subgenomes depends on
genetic adaptations of the replicon (Blight, K. J., et
al., 2000. Science 290:1972-1975; Guo, J. T., et al.,
2001. J. Virol. 75:8516-8523; Lohmann, V., et al., 2001.
J. Virol. 75:1437-1449), presumably to compensate for
subtle variations in the cellular environments among
cells from different tissues, it was hypothesized that
replication in cells of nonhepatic origin would require
additional, cell-type-specific adaptive mutations.
Transfection of several primate- and rodent-derived cell
lines with subgenomic RNA transcribed from plasmid DNA
carrying adaptive mutations previously identified in Huh7
cells did not yield cell lines expressing replicons. To
increase the chance for the selection of RNA subgenomes
capable of replicating in cells of nonhepatic origin,
subgenomic RNA isolated from Huh7 cell lines that
replicate HCV RNA was used. Because of the high rate of
nucleotide incorporation errors that occur during RNA-
directed RNA synthesis, this population of viral
subgenomes exhibited much greater genetic heterogeneity
than did RNA transcribed from a DNA template in vitro.

Upon transfection of HeLa cells with total RNA
obtained from Huh7 cell lines GS4.1, GS4.5, and Bsp8,
G418-resistant cell clones were obtained. The number of
clones ranged from approximately 2 (Bsp8) to 50 (GS4.1)
per 10 µg of total RNA depending on the origin of the RNA
used for the transfections. Replicons in these three
Huh7-derived cell lines contained different adaptive
mutations and replicated two different HCV 1b genomes
(Guo, J. T., et al., 2001. J. Virol. 75:8516-8523).
Several HeLa-derived colonies obtained with total RNA
from GS4.1 cells were subsequently expanded into seven
stable cell lines (SL1 to SL7; Fig. 1A, lanes 4 and 8 to
12). The amounts of viral RNA present in early passages
of the cell lines examined ranged from 0.05 to 7.5
ng/10 µg of total RNA, which corresponded to 20 to 3,000
copies of RNA per cell. In general, the amounts of RNA
increased upon passage of cells and reached levels that
were comparable to those obtained with the most
productive Huh7-derived cell lines such as GS4.1 (lanes
2 and 4 to 6). As expected, expression of viral gene
products could be confirmed by immunofluorescence with
antibodies directed against NS5A (Fig. 1B). As with GS4.1
cells, more than 90% of SL1 cells expressed viral
proteins. However, in contrast to Huh7 cell lines where
the accumulation of HCV RNA declines approximately 100-
fold when cells become confluent, viral replication in
HeLa cells was not affected by the growth conditions of
the cells, i.e., SL1 cells continued to produce high
amounts of viral RNA even when they became confluent
(results not shown) (Guo, J. T., et al., 2001. J.
Virol. 75:8516-8523; Pietschmann, T., et al., 2001. J.
Virol. 75:1252-1264).
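The two endpoints quoted above (0.05 ng and 7.5 ng of viral RNA per 10 µg of total RNA, corresponding to 20 and 3,000 copies per cell) imply the same proportionality factor of 400 copies per cell for each nanogram of viral RNA per 10 µg of total RNA. The sketch below simply applies that factor. The constant is back-calculated from the reported figures and is not stated in the text; it will only hold for replicons and cells comparable to those described here, and the function name is illustrative.

```python
# Back-calculated from the reported endpoints: 20 / 0.05 and 3000 / 7.5
# both give 400 copies per cell per (ng viral RNA / 10 ug total RNA).
COPIES_PER_CELL_PER_NG = 400.0


def copies_per_cell(viral_rna_ng_per_10ug):
    """Estimate replicon copies per cell from ng viral RNA per 10 ug total RNA."""
    if viral_rna_ng_per_10ug < 0:
        raise ValueError("RNA amount cannot be negative")
    return viral_rna_ng_per_10ug * COPIES_PER_CELL_PER_NG


if __name__ == "__main__":
    for amount in (0.05, 7.5):
        print(f"{amount} ng/10 ug -> about {copies_per_cell(amount):.0f} copies per cell")
```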

Adaptation of HCV replicons.

To determine whether HCV replication in HeLa cells
led to the selection of subgenomes with cell-type-
specific adaptive mutations, the efficiency with which
G418-resistant colonies formed in Huh7 and HeLa cells
transfected with total RNA isolated from GS4.1 and SL1
cells was compared. Total RNA from GS4.1 cells led to the
selection of approximately 166 G418-resistant colonies
per ng of viral RNA in Huh7 cells compared with only 4
colonies in HeLa cells (Table 1). In contrast, total RNA
from SL1 cells yielded 160 colonies in HeLa cells
compared with about 20 in Huh7 cells. These results
indicated that replication in HeLa cells led to the
selection of variants with cell-type-specific adaptive
mutations that were responsible for the 40-fold increase
in colony formation efficiency between amplified RNA in
GS4.1 and SL1 cells. Nucleotide sequence analysis of HCV
cDNA clones obtained from the SL1 and SL2 cell lines
confirmed this view. These data showed that replicons in
the two HeLa cell lines maintained the previously
identified adaptive mutations in GS4.1 cells and acquired
several additional mutations that resulted in amino acid
changes in the NS region (Figure 2 and Table 2). Notably,
some of the new mutations formed clusters in the NS4B and
NS5A regions. In the case of SL1 cells, a deletion of 43
amino acids near the C terminus of NS5A was observed. Of
particular interest were mutations in the amino-terminal
region of NS4B, because they have so far not been found
in cDNAs from replicons in Huh7 cells and hence could
have been responsible for the observed adaptation of
replicating RNA (Blight, K. J., et al., 2000. Science
290:1972-1975; Guo, J. T., et al., 2001. J. Virol.
75:8516-8523; Krieger, N., et al., 2001. J. Virol.
75:4614-4624; Lohmann, V., et al., 2001. J. Virol.
75:1437-1449). Moreover, one mutation at position 1749
was present in both SL1 and SL2 cells. In contrast to the
results obtained with the NS regions, no mutations were
detected in the 5' and 3' untranslated regions of
replicons expressed in SL1 and SL2 cells.
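The observation that a single NS4B mutation is shared between cell lines can be checked by treating each clone's mutation list as a set. The sketch below is illustrative: the NS4B entries are transcribed from Table 2, while the helper names and the dictionary layout are assumptions made for the example.

```python
import re

# Substitution labels such as "V1749A": wild-type residue, position, new residue.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# NS4B consensus mutations per cell clone, transcribed from Table 2.
NS4B = {
    "SL1": {"Q1720R", "Q1727H", "V1749A", "V1893L"},
    "SL2": {"L1715P", "Q1737R", "V1749A", "I1797V", "N1965Y"},
    "MH1": {"Q1720R", "Q1727H", "V1749A", "V1893L"},
    "MH2": {"Q1720R", "Q1727H", "V1749A", "V1893L"},
    "MH4": {"Q1720R", "Q1727H", "V1749A", "V1893L", "A1841T"},
}


def parse_mutation(label):
    """Split a label like 'V1749A' into ('V', 1749, 'A')."""
    match = MUTATION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def shared_mutations(clones, table=NS4B):
    """Return the mutations present in every one of the named clones."""
    return set.intersection(*(table[name] for name in clones))


if __name__ == "__main__":
    print(shared_mutations(["SL1", "SL2"]))
    print(shared_mutations(list(NS4B)))
    print(parse_mutation("V1749A"))
```

For the sets as transcribed, V1749A is the only NS4B mutation common to SL1 and SL2, and it remains the only one shared when the three mouse cell lines are included.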



TABLE 1. Colony formation efficiency of total cellular RNA*

                    No. of colonies in transfected cells

                    Huh7                  HeLa                  Hepa1-6
Cell   Viral RNA    Mean (SD)  Colonies/  Mean (SD)  Colonies/  Mean (SD)  Colonies/
       (ng/10 µg)              ng of                 ng of                 ng of
                               viral RNA             viral RNA             viral RNA

GS4.1  5            834 (64)   166        22 (4)     4          0          <1
SL1    5            100 (53)   20         803 (81)   160        1.3 (1.5)  <1
MH1    0.5          20 (2)     40         66 (9)     132        1.7 (0.6)  3

* Results from three independent transfection experiments. Total RNA was
extracted from GS4.1, SL1, and MH1 cells at passages 21, 26, and 4,
respectively.
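The per-nanogram efficiencies are obtained by dividing the mean colony count by the amount of viral RNA in the transfected sample. The sketch below reproduces that arithmetic from the mean counts and RNA inputs reported for Table 1; the dictionary layout and function name are assumptions. Rounding the quotients down gives the values cited in the text (approximately 166, 4, 160, and 20).

```python
import math

# Mean colony counts and viral RNA input (ng per 10 ug total RNA), from Table 1.
TABLE_1 = {
    "GS4.1": {"viral_rna_ng": 5.0, "Huh7": 834, "HeLa": 22, "Hepa1-6": 0},
    "SL1": {"viral_rna_ng": 5.0, "Huh7": 100, "HeLa": 803, "Hepa1-6": 1.3},
    "MH1": {"viral_rna_ng": 0.5, "Huh7": 20, "HeLa": 66, "Hepa1-6": 1.7},
}


def colonies_per_ng(source, recipient, table=TABLE_1):
    """Mean colonies in the recipient cells per ng of viral RNA from the source."""
    row = table[source]
    return row[recipient] / row["viral_rna_ng"]


if __name__ == "__main__":
    gs41_hela = colonies_per_ng("GS4.1", "HeLa")
    sl1_hela = colonies_per_ng("SL1", "HeLa")
    print(f"GS4.1 RNA in HeLa: {gs41_hela:.1f} colonies/ng")
    print(f"SL1 RNA in HeLa:   {sl1_hela:.1f} colonies/ng")
    # Ratio of the rounded-down values versus ratio of the unrounded values.
    print(math.floor(sl1_hela) / math.floor(gs41_hela))
    print(round(sl1_hela / gs41_hela, 1))
```

The 40-fold figure corresponds to the ratio of the rounded values (160/4); the unrounded means give a ratio of about 36.5.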
TABLE 2. Consensus mutations in replicons isolated
from HeLa and mouse hepatoma cell clones

Cell clone  Conserved mutation(s)                             NS protein

GS4.1       E1202G                                            NS3
            S2204I, D2254E, I2324V                            NS5A

SL1         Q1067R, S1128A, E1202G, S1323P, S1560G^a          NS3
            L1701F                                            NS4A
            Q1720R, Q1727H, V1749A, V1893L                    NS4B
            T2035A, S2204I, I2252V, D2254E, I2274V, R2290L,
            I2324V, del. 2371-2413^b                          NS5A
            W2990R                                            NS5B

SL2         I1097V, Q1112R, P1115L, V1593M, M1647I            NS3
            L1715P, Q1737R, V1749A, I1797V, N1965Y            NS4B
            Q2012L, S2204I, E2247G, D2254E, K2302R, I2324V,
            S2336P, L2400S, E2411Q, A2412V                    NS5A

MH1         Q1067R, S1128A, E1202G, S1323P, S1560G            NS3
            Q1720R, Q1727H, V1749A, V1893L                    NS4B
            T2035A, S2204I, I2252V, D2254E, I2274V, R2290L,
            I2324V, M2388T, T2496A                            NS5A
            W2990R                                            NS5B

MH2         Q1112R, E1202G^a, S1323P, S1560G                  NS3
            L1701F                                            NS4A
            Q1720R, Q1727H, V1749A, V1893L                    NS4B
            T2035A, T2185A, S2204I, I2252V, D2254E, I2274V,
            R2290L, I2324V^a                                  NS5A
            W2990R^a                                          NS5B

MH4         Q1067R, S1128A, E1202G, S1323P, S1560G            NS3
            L1701F                                            NS4A
            Q1720R, Q1727H, V1749A, V1893L, A1841T            NS4B
            T2035A, S2204I, I2252V, D2254E, I2274V, R2290L,
            I2324V, T2364M, L2391R                            NS5A
            I2843V, W2990R                                    NS5B

^a Mutation that occurred in 50% of the clones analyzed.
^b del., deletion.

Mouse hepatoma cells can support HCV RNA replication.

The discovery of several additional mutations in
cDNA clones obtained from SL1 and SL2 cells suggested that
total RNA from these cell lines might yield colonies in
cells that did not appear to be permissive for HCV
replication after transfection with subgenomic RNA or
total RNA from Huh7-derived cell lines. Hepatoma and
hepatocyte-derived cell lines were examined. G418-
resistant colonies were obtained with the mouse hepatoma
cell line Hepa1-6 after transfection with total RNA from
SL1 cells (Figure 3A, lanes 4 to 6, 9, and 12). As with
HeLa cells, the amounts of RNA ranged from 300 to 1,000
copies of RNA per cell and a large fraction of the cells
expressed viral proteins (Figure 3B). In contrast to Huh7
and HeLa cells, the amount of HCV RNA in the mouse cell
lines appeared to vary between cell passages (Figure 3A,
lanes 6 to 14). Interestingly, total RNA isolated from
one of the mouse cell lines, MH1, did not produce
significantly more colonies in Hepa1-6 cells than did
total RNA from SL1 cells, suggesting that the subgenomes
present in SL1 cells were already adapted for replication
in the mouse cells (Table 1). In support of this
interpretation, nucleotide sequence analysis of viral
cDNAs cloned from three mouse cell lines showed that the
majority of the mutations identified in SL1 cells were
maintained (Figure 2). Surprisingly, the deletion in NS5A
identified in four of four clones sequenced from SL1
cells was not present in replicons isolated from mouse
cells, indicating that a subpopulation of replicons
without the deletion was still present in these (SL1)
cells.

In further experiments, mouse hepatocyte cells AML12
(ATCC CRL-2254) were transfected with total RNA isolated
from the cell line GS4.1, expressing subgenomic replicons,
and from cell line HFL, expressing full-length HCV
genomes, respectively. G418-resistant colonies were
isolated to establish stable cell lines expressing HCV
subgenomic and full-length replicons. Five micrograms of
total RNA was isolated from AML12 cell lines (MA6-5 to
MA6-8, and MAC1) that were established from G418-
resistant cell colonies and analyzed by Northern blot
analysis, which confirmed replication of HCV (Figure 4).
Cell-derived HCV RNA is more efficient than in vitro-
transcribed RNA in initiating replication in HeLa and
mouse hepatoma cells.

Results showed that replication of HCV subgenomes in
HeLa and mouse cells led to the selection of replicons
with several novel mutations. The majority of these
mutations were located in the NS3, NS4B, and NS5A
regions. Moreover, the results showed that cell-derived
RNA carrying some or all of these mutations was much more
efficient in establishing G418-resistant colonies in HeLa
cells than was RNA derived from Huh7 cells (Table 1).
Based on these observations, it was surmised that
introduction of these mutations into available subgenomic
replicons should alter or expand their tissue and host
tropism. To test this hypothesis, 13 subgenomic replicons
were designed that carried mutations in NS3, NS4B, and
NS5A alone or in combination with each other as described
in Table 3. Of the 13 constructs examined, only two, pZS2
and pZS25, yielded a small number of G418-resistant
colonies in HeLa cells (Table 4). Viral RNA replication
was confirmed by Northern blot analysis of total RNA
isolated from six cell lines derived from those colonies.
None of the variants yielded colonies in Hepa1-6 cells.
Moreover, negative-control experiments with in vitro-
transcribed RNA derived from a variant containing a
frameshift mutation in NS5B did not yield any colonies
that could be expanded into cell lines. Notably, save for
one, all replicons were permissive for replication in
Huh7 cells, albeit with significantly different
efficiencies (Table 3). Interestingly, both pZS2 and
pZS25 carried mutations in NS4B that were conserved in
replicons from two independent HeLa cell lines, SL1 and
SL2. In addition, these replicons had the S2204I mutation
in NS5A that was previously found to be one of the most
potent adaptive mutations for HCV replication in Huh7
cells. Because both replicons replicated very efficiently
in Huh7 cells, the results suggested that the NS4B
mutations could have contributed to the observed
expansion of the tissue tropism of HCV replicons. In
support of this hypothesis, the subgenome with the
highest efficiency in Huh7 cells, pZS11, lacking mutations
in NS4B (Table 3), did not yield any colonies in HeLa
cells.

TABLE 3. Colony formation efficiency of mutant replicons
in Huh7 cells

Plasmid      Conserved        Mean (SD) CFE/                                NS protein(s)
             mutation(s)      µg of RNA^g                                   affected

I377/NS3-3'  NA^h             2.3 (1.5)                                     NA
pZS10        A1^a             3.3 (3.5)                                     NS3
pZS1         C1^b             1.4 × 10^[illegible] (3.9 × 10^[illegible])   NS5A
pZS20        B^c              1.7 (0.6)                                     NS4AB
pZS11        A1 + C1          2.4 × 10^[illegible] (7 × 10^[illegible])     NS3, NS5A
pZS2         B + C1           7.8 × 10^[illegible] (1.7 × 10^[illegible])   NS4AB, NS5A
pZS12        A1 + B           0.3 (0.6)                                     NS3, NS4AB
pZS5         A1 + B + C1      15                                            NS3, NS4AB, NS5A
pZS4         B + C1 + A2^d    165 (21)                                      NS3, NS4AB, NS5A
pZS8         A2 + B           0                                             NS3, NS4AB
pZS6         A2 + B + C2^e    8 (4)                                         NS3, NS4AB, NS5A
pZS15        C3^f             491 (183)                                     NS5A
pZS25        B + C3           1.7 × 10^[illegible] (2.6 × 10^[illegible])   NS4AB, NS5A
pZS45        A2 + B + C3      19 (2)                                        NS3, NS4AB, NS5A

^a A1, mutation E1202G.
^b C1, mutations S2204I and D2254E.
^c B, mutations L1701F, Q1720R, Q1727H, and V1749A.
^d A2, mutations E1202G, S1128A, and S1323P.
^e C2, mutations S2204I, D2254E, R2290L, and I2324V.
^f C3, all the mutations of C2 plus the deletion 2371-2413.
^g CFE, colony formation efficiency. Values are derived from three independent
transfections of each replicon RNA.
^h NA, not applicable.
Sequence ID Numbers for subgenomic replicons are listed in Table 6.
TABLE 4. Colony formation efficiency of in vitro-
transcribed RNA^a

Source of cDNA        No. of colonies in transfected cells
library (cell line
or plasmid)           Huh7       HeLa       Hepa1-6

GS4.1                 >10,000    0, 0, 0    0, 1, 0
SL1                   >10,000    0, 3, 2    0, 1, 0
MH4                   >10,000    3, 4, 0    17, 0, 0
pZS2                  >10,000    2, 3, 0    0, 0, 0
pZS25                 >10,000    0, 2, 1    0, 0, 0

^a Results from transfection experiments with in
vitro-transcribed RNA from pooled clones isolated from
the indicated cell line and from in vitro-transcribed RNA
from pZS2 and pZS25 (Table 3).

To further explore the basis for the observed low
colony formation efficiency of in vitro-transcribed RNA
in HeLa cells, it was determined whether replication in
HeLa cells led to the selection of adaptive mutations that
were not discovered previously when cDNA clones from SL1
and SL2 cells were sequenced. For this purpose, cDNA
clones were isolated from total RNA obtained with pZS2-
and pZS25-derived cell lines, respectively. Nucleotide
sequence analysis of both cDNA clones did not reveal any
additional consensus mutations, suggesting that the two
subgenomes were sufficiently adapted for replication in
HeLa cells (results not shown). However, as mentioned
above, it was possible that a minor population of
subgenomic replicons with additional mutations was
present in these cell lines. To overcome this problem, a
method for the isolation and cloning of cDNAs spanning
the NS3 to NS5B region was developed (see Materials and
Methods). Replicon cDNA libraries were produced from
GS4.1, SL1, and MH4 cells. Approximately 2,000 cDNA
clones were pooled and subsequently used for in vitro
transcription of subgenomic RNA. With Huh7 cells, the
colony formation efficiency of the pooled clones was
comparable to that of the most efficient subgenomes, such
as pZS2 or pZS25, and did not vary significantly with the
origin of the total RNA used for cDNA cloning (Table 4).
Consistent with previous results, colony formation in
HeLa and mouse cells was origin dependent, i.e., save for
one case, colonies were observed only with clones derived
from SL1 and MH4 cell lines. Notably, with this strategy
G418-resistant colonies were obtained for the first time
with Hepa1-6 cells by using in vitro-transcribed RNA. To
confirm the presence of viral RNA, 11 colonies were
expanded and Northern blot analysis was performed with
total RNA. All 11 RNA samples analyzed contained viral
RNA ranging from approximately 0.1 to 1 ng/5 µg of total
RNA (results not shown).

Taken together, the results supported the hypothesis
that mutations identified in subgenomic replicons
expressed in HeLa and mouse cells play a role in
adaptation of the replicons to certain cell-type-specific
conditions.

DISCUSSION

HCV is known as a species- and tissue-specific
virus. The results described herein show that replication
of HCV can occur in cells derived from tissues other than
liver, indicating that cellular factors required for RNA
replication are expressed in cell types other than
hepatocytes. One interpretation of this result is that
the apparent tropism of HCV for hepatocytes is determined
primarily at the level of virus entry or assembly or,
alternatively, that HCV can infect many other tissues but
has escaped detection due to very low amounts of RNA
replication or accumulation. Extrahepatic tissues could
serve as reservoirs for HCV that, as with human
immunodeficiency virus, could provide a source of viruses
that are refractory to antiviral therapy and,
importantly, can be responsible for infection of liver
grafts following orthotopic liver transplantation (Chun,
T. W., et al., 2002. J. Infect. Dis. 185:1672-1676;
Laskus, T., et al., 2002. J. Infect. Dis. 185:417-421).
Such a scenario would have profound implications for
antiviral therapy. For example, the targeting of drugs to
secondary sites of viral replication and the analysis of
drug metabolism in cells other than hepatocytes would
become important factors for the development of
successful antiviral therapies.

It is conceivable that HCV quasispecies in
hepatocytes and other tissues exhibit differences in
their composition due to the selection of variants with
cell-type-specific adaptations. As shown in this Example,
replication of subgenomes in HeLa cells led to the
accumulation of clusters of mutations in the NS3, NS4B,
and NS5A regions including a deletion in NS5A (Figure 2).
Mutations and deletions in NS5A have been found
previously in genomes that replicated in Huh7 cells,
which could suggest that expression of the natural form
of this protein in cell culture somehow interferes with
RNA replication (Blight, K. J., et al., 2000. Science
290:1972-1975; Guo, J. T., et al., 2001. J. Virol.
75:8516-8523; Ikeda, M., et al., 2002. J. Virol. 76:2997-
3006; Lohmann, V., et al., 2003. J. Virol. 77:3007-3019;
Lohmann, V., et al., 2001. J. Virol. 75:1437-1449).
However, mutations in the amino terminus of NS4B have
previously not been observed. Notably, in both SL1 and
SL2 cells, the mutations changed two or one glutamine
residues, respectively, to one of the two basic amino
acids arginine and histidine. Moreover, the mutation
V1749A was present in all five cell lines examined (Table
2 and Figure 2). Thus far, these results show that these
mutations appear to be required for replication in HeLa
cells, because only replicons pZS2 and pZS25 carrying
these mutations yielded colonies after transfection with
in vitro-transcribed RNA (Tables 3 and 4). The amino
terminus of NS4B is predicted to reside on the
cytoplasmic side of endoplasmic reticulum membranes and
may interact with other host or viral proteins required
for RNA replication (Hugle, T., et al., 2001. Virology
284:70-81). As an integral endoplasmic reticulum membrane
protein, NS4B might provide a scaffold for the assembly
of replication complexes and act as a regulator for RNA
replication. More importantly, a recent study revealed
that NS4B can induce particular membrane structures,
called membranous webs, proposed to be the site for HCV
replication (Egger, D., et al., 2002. J. Virol. 76:5974-
5984). Interestingly, genetic analyses with an HCV-
related pestivirus identified the amino-terminal region
of NS4B as a determinant for cytotoxicity caused by high
levels of virus replication (Qu, L., et al., 2001. J.
Virol. 75:10651-10662).

In summary, this example demonstrates that HCV RNA
replication is not restricted to the human hepatoma cell
line Huh7 but instead occurs in HeLa cells and hepatoma
cells derived from mice. These findings further
facilitate development of a mouse model for HCV
infection.

Example II RNA Polymerase Inhibitors Inhibit HCV
Replication in Transformed HeLa Cells

In this example, the anti-HCV activity of 2'-C-
methyladenosine (2CMA, an HCV RNA polymerase inhibitor)
was tested on GS4.1 (Huh7) cells and SL1 (HeLa) cells. The
cells were treated with 10 µM 2CMA. The cells were
harvested at 6, 12, 24, 48, and 72 hours. Total cellular
RNA was extracted and viral RNA (vRNA) analyzed by
Northern blot analysis. These results indicate that 2CMA
effectively inhibits HCV in GS4.1 and SL1 cells (Figure
5).

Next, the antiviral activity of the HCV RNA
polymerase inhibitor 5-OH-cytidine was tested. GS4.1
(Huh7) cells and SL1 (HeLa) cells were treated with the
indicated amounts of 5-OH-cytidine. The DNA polymerase
inhibitor 5-OH-deoxy-cytidine was used as a negative
control. The cells were harvested 72 hours after
incubation with the drugs. Total cellular RNA was
extracted and viral RNA analyzed by Northern blot
analysis. The intensity of the bands corresponding to HCV
RNA was determined with a Fuji phosphoimager. These
results indicate that 5-OH-cytidine effectively inhibits
HCV replication in GS4.1 and SL1 cells (Figure 6).

Example III Cytopathic and Noncytopathic Responses
in Cells Expressing Hepatitis C Virus

Currently, combination treatment with alpha-
interferon and ribavirin is the therapy of choice for HCV
infection. But this treatment is not always effective,
and other treatment choices are limited or have unproven
efficacy. Study of the mechanism of action of IFN-α may
help elucidate a new, effective treatment for HCV, or
help determine what makes HCV treatment effective.

DNA microarray studies revealed that the antiviral
response induced by IFNs alters the expression of
hundreds of genes and, hence, is far more complex than
previously anticipated (Der, S. D., et al., 1998. Proc.
Natl. Acad. Sci. USA 95:15623-15628). Little is known
about the nature of the cellular proteins that
specifically target viral components and, hence, are
responsible for the inhibition of viral replication in
the absence of cell death. In contrast, the major signal
transduction pathways required for the innate immune
response against many viruses have been elucidated. The
first wave of IFN-induced genes depends on the
phosphorylation of STAT1 and STAT2 and their interaction
with IRF9 (p48) to form the transcription factor complex
ISGF3. In addition, viral double-stranded RNA (dsRNA)
and other unknown viral factors are believed to play an
important role in the establishment of an antiviral
state. They can activate dsRNA-dependent enzymes such as
protein kinase R (PKR) and 2',5'-oligoadenylate synthase
(OAS), as well as other still-elusive protein kinases
(Smith, E. J., et al., 2001. J. Biol. Chem. 276:8951-
8957). IFN-α can induce a noncytopathic antiviral
response or, alternatively, trigger apoptotic programs
leading to the elimination of infected cells (Tanaka, N.,
et al., 1998. Genes Cells 3:29-37).

Nucleotide sequence analyses of HCV genomes isolated
from Japanese patients revealed a correlation between the
presence of mutations in a short segment of NS5A, termed
the IFN sensitivity-determining region (ISDR), and
resistance to antiviral therapy with IFN-α (Enomoto, N.,
et al., 1995. J. Clin. Investig. 96:224-230; Enomoto, N.,
et al., 1996. N. Engl. J. Med. 334:77-81). Subsequently,
it was reported that the ISDR motif can bind to PKR
(Gale, M. J., Jr., et al., 1997. Virology 230:217-227).
Importantly, the ISDR from IFN-resistant, but not from
IFN-sensitive, HCV isolates appeared to be a substrate
for PKR, suggesting that IFN treatment of chronically
infected patients can lead to the selection of HCV
variants with ISDRs that can bind and inactivate PKR
(Gale, M. J., Jr., et al., 1998. Clin. Diagn. Virol.
10:157-162; Tan, S. L., and M. G. Katze. 2001. Virology
284:1-12).

Accordingly, study of HCV variants and the pathway
by which IFN inhibits HCV is necessary to provide new HCV
treatments, and to prevent selection of IFN-α resistant
variants.

MATERIALS AND METHODS

Chemicals and reagents. Recombinant IFN-α2b (Intron A)
was purchased from Schering-Plough. Cycloheximide, 2-
aminopurine (2-AP), genistein, sodium salicylate, and
wortmannin were obtained from Sigma. SB 203580, PD 98059,
vanadate, PP2, rapamycin, and lactacystin were obtained
from Calbiochem. Epoxomicin was obtained from Boston
Biochem, and caspase inhibitor ZVAD-fluoromethyl ketone
(ZVAD-FMK) was obtained from Enzyme Systems Products.

Cell culture. Huh7 and HeLa cells were maintained in
Dulbecco's modified Eagle's medium supplemented with 10%
fetal bovine serum, penicillin G, streptomycin,
nonessential amino acids, and L-glutamine. For the cell
lines carrying HCV and Kunjin virus replicons, 500 µg of
G418/ml was added to the medium. The GS4.1 cell line was
derived from a subclone of FCA1 cells as described
previously (Guo, J. T., et al., 2001. J. Virol. 75:8516-
8523). SL1 is a HeLa cell line expressing HCV subgenomic
replicon I377/NS3-3' (Lohmann, V., et al., 1999. Science
285:110-113; Zhu, Q., et al., 2003. J. Virol. 77:9204-
9210). The KUNCD20 cells represent a pool of
approximately 200 colonies of G418-resistant HeLa cells
obtained after transfection with the Kunjin virus
replicon C20DXrepNeo RNA (Khromykh, A. A., and E. G.
Westaway. 1997. J. Virol. 71:1497-1505) (kindly provided
by A. Khromykh, Sir Albert Sakzewski Virus Research
Center, Brisbane, Australia).
Plasmids . pCMV-E3L expressing the vaccinia virus E3L 
35 protein was obtained from Robert Schneider, New York 
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University, New York. pln035 expressing virus-associated 
(VA) KNA was obtained, from David Lazinski, Tufts 
University, Boston, Mass. pEF-HA-HPIV2 expressing the V 
protein of human parainf luenzavirus 2 {HPIV2) was 
5 obtained from Curt Horvath, Mount Sinai School of 

Medicine, New York, N.Y. To obtain cDNA clones of the 
gene encoding human Mx-1, Huh7 cells were treated with 
100 lU of IFN-a/ml for 6 h and total cellular ENA was 
extracted with TRIzol reagent (Invitrogen) and first- 

10 strand cDNA was made with an oligo (dT) i2_i8 primer and 

Superscript II DNA polymerase (Invitrogen) by following 
the manufacturer's direction. For the amplification of 
Mx-A cDNA, the primers used were 5 ' - 
A6TATCGTGGTAGAGAGCTGC-3 ' {SEQ ID NO: 15) and 5'- 

15 TAATACGACTCACTATAGGGATGTGGCTGGAGATGC-3 ' (SEQ ID NO: 16). 

The purified PGR fragment was cloned into the pGEM-T Easy 
vector (Promega) . The identity of the cloned fragment was 
verified by nucleotide sequence analysis . 
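The second primer (SEQ ID NO: 16) begins with the 20-nucleotide sequence commonly used as a bacteriophage T7 RNA polymerase promoter, which would be consistent with, although the text does not state, use of the amplified fragment as a template for riboprobe synthesis. The sketch below merely splits the primer into that promoter tail and the remaining gene-specific portion; the function name is illustrative.

```python
# Widely used T7 RNA polymerase promoter sequence, written 5' to 3'.
T7_PROMOTER = "TAATACGACTCACTATAGGG"

# Second Mx-A primer as listed in the text (SEQ ID NO: 16).
PRIMER_SEQ_ID_16 = "TAATACGACTCACTATAGGGATGTGGCTGGAGATGC"


def split_promoter_tail(primer, promoter=T7_PROMOTER):
    """Return (promoter tail, gene-specific part); the tail is '' if absent."""
    if primer.startswith(promoter):
        return promoter, primer[len(promoter):]
    return "", primer


if __name__ == "__main__":
    tail, specific = split_promoter_tail(PRIMER_SEQ_ID_16)
    print(f"promoter tail ({len(tail)} nt): {tail}")
    print(f"gene-specific ({len(specific)} nt): {specific}")
```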

RNA extraction and Northern blot hybridization. Total
cellular RNA was extracted with TRIzol reagent
(Invitrogen) by following the manufacturer's direction.
Five micrograms of total RNA was fractionated on a 1%
agarose gel containing 2.2 M formaldehyde and transferred
onto nylon membranes. Membranes were hybridized with
riboprobes specific for plus-stranded HCV replicon RNA
and Mx-A and β-actin mRNA under the conditions described
previously (Guo, J. T., et al., 2001. J. Virol. 75:8516-
8523).

Detection of eIF-2α phosphorylation by Western blotting.
For Western blot analysis of eIF-2α phosphorylation,
cells were treated with 100 IU of IFN-α/ml for 12 h and
then transfected with 2 µg of poly(I:C) per 50-mm-
diameter plate by using Lipofectamine (Invitrogen). After
3 h of incubation, cells were lysed in high-salt
radioimmunoprecipitation assay buffer (50 mM Tris-HCl [pH
8.0], 250 mM NaCl, 1% NP-40, 0.5% deoxycholate, 0.1%
sodium dodecyl sulfate [SDS]). Proteins (40 µg) were
separated on an SDS-10% polyacrylamide gel and
electrophoretically transferred to nitrocellulose
membranes. Membranes were incubated with 50% methanol,
washed extensively with water, and blocked with 3% casein
in TNET buffer (10 mM Tris-HCl [pH 7.5], 150 mM NaCl, 1
mM EDTA, 0.05% Tween 20). Membranes were incubated with
rabbit polyclonal antibodies against eIF-2α (a gift from
Robert Schneider, New York University) or phosphorylated
eIF-2α (eIF-2α-P; Research Genetics, Inc.) diluted in
blocking solution for 1 h and then washed extensively
with TNET buffer. Membranes were then incubated with
horseradish peroxidase-conjugated goat anti-rabbit
immunoglobulin G (IgG) (Amersham). The
bound IgG was detected with Super-Signal
chemiluminescence reagents (Pierce).

RPA. For the analysis of IFN-β gene expression, cells
were treated with 100 IU of IFN-α/ml for 12 h and then
transfected with 2 μg of poly(I:C) per 60-mm plate by
using Lipofectamine (Invitrogen). After 3 h of
incubation, total cellular RNA was extracted with TRIzol
reagent (Invitrogen), and IFN-β mRNA levels were
determined by RNase protection assay (RPA) with the
RPA II kit (Ambion). The probes complementary to IFN-β
(GenBank accession no. M25460) and β-actin (GenBank
accession no. BC013380) mRNAs spanned positions 272 to
650 and 1030 to 1250, respectively.
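The probe coordinates given above imply the lengths of the mRNA regions covered by each probe, which is how the two protected fragments can be told apart by size. A minimal sketch of that arithmetic follows, assuming the positions are 1-based and inclusive and ignoring any vector-derived sequence carried by the probes; the function and variable names are hypothetical.

```python
def span_length(start, end):
    """Length of a region given 1-based, inclusive coordinates (an assumption)."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates as stated in the text for the two RPA probes.
PROBE_SPANS = {
    "IFN-beta (M25460)": (272, 650),
    "beta-actin (BC013380)": (1030, 1250),
}

probe_lengths = {name: span_length(*span) for name, span in PROBE_SPANS.items()}

for name, length in probe_lengths.items():
    print(f"{name}: {length} nt")
```

Under these assumptions the two covered regions differ by more than 150 nucleotides.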

Annexin V-FITC staining. SL1 cells were plated on
coverslips in six-well plates 16 h prior to treatment.
Cells were then mock treated or treated with 100 IU of
IFN-α/ml in the absence or presence of 20 μM ZVAD-FMK.
Coverslips were then put on slides and incubated with 100
μl of staining solution containing annexin V-fluorescein
isothiocyanate (FITC) at room temperature for 10 to 15
min. After extensive washes with phosphate-buffered
saline, slides were examined with a Nikon fluorescence
microscope and photographed with a charge-coupled device
camera.

Flow cytometry analysis. To determine the fraction of
apoptotic cells, the annexin V assay system (Roche
Diagnostics GmbH) was used. Cells were incubated with
IFN-α (100 IU/ml) alone or together with 20 μM ZVAD-FMK
for 24 h. The culture medium containing detached cells
was collected, and the adherent cells were trypsinized
and then combined with the detached cells. The cells were
collected by centrifugation and washed once in
phosphate-buffered saline. Pelleted cells were
resuspended in binding buffer and incubated with annexin
V-FITC at room temperature for 10 to 15 min. The stained
cells were then diluted with binding buffer and analyzed
by flow cytometry (FACScan; Becton Dickinson).

RESULTS

IFN-α can induce noncytopathic and cytopathic antiviral
responses in cells comprising HCV replicons.

As set forth above, stable HeLa cell lines were
established that express HCV subgenomes with an
efficiency similar to that of Huh7 cells (Zhu, Q., et
al., 2003. J. Virol. 77:9204-9210). Examination of the
IFN-α response in HeLa-derived cell lines such as SL1
revealed a very similar dose-dependent reduction of virus
replication. The IC50 of IFN-α in HeLa cell lines was
generally in the range of 0.1 IU/ml, approximately
10-fold lower than the IFN-α IC50 in Huh7-derived cell
lines (Figure 7). However, in marked contrast to
observations made with Huh7 cell lines, treatment of SL1
and other HeLa-derived cell lines with more than 30 IU of
IFN-α/ml induced cell death in a significant fraction of
cells between 6 and 20 h posttreatment (Figure 8A). Cell
death was caused by apoptosis, as determined by annexin V
staining, and could be prevented by the caspase inhibitor
ZVAD-FMK. The fraction of apoptotic cells was determined
before and after treatment with the cytokine. The results
showed that IFN-α induced apoptosis in more than 30% of
SL1 cells compared with 6 to 7% in untreated SL1 and
parental HeLa cells (Figure 8B). Several other
HeLa-derived cell lines were examined to ensure that the
results obtained with SL1 cells reflected a general
property of HeLa cells expressing HCV subgenomes.

Moreover, to test more directly whether IFN-induced
apoptosis was caused by viral replication, two methods
were used to inhibit RNA replication in SL1 cells. The
first was based on the observation that replication of
HCV subgenomes is temperature sensitive and is inhibited
at 39°C (J. A. Sohn and C. Seeger, unpublished
observations). The second method relied on the
availability of an inhibitor of the viral RNA polymerase.
Consistent with a role for viral replication in the
induction of apoptosis, cell death could be prevented
when viral replication was inhibited by either incubation
of the cells for 60 h at the elevated temperature or
treatment with a viral polymerase inhibitor (Figure 8B
and results not shown).

These results raised the question of whether
IFN-induced apoptosis reflected a general property of
HeLa cells expressing viral replicons. To address this
question, the IFN response against HCV in SL1 cells was
compared with that against the flavivirus Kunjin virus in
HeLa cells. For this purpose a pool of HeLa cells,
KUNCD20, expressing Kunjin virus subgenomic replicons
lacking the structural genes, similar to the HCV
subgenomic replicons, was established (Khromykh, A. A.,
and E. G. Westaway. 1997. J. Virol. 71:1497-1505). The
Kunjin virus RNA levels in these cells were approximately
fivefold higher than those observed with HCV in SL1
cells. In contrast to results with SL1 cells, treatment
of KUNCD20 cells with different concentrations of IFN-α
only slightly inhibited viral replication (Figure 9).
Importantly, cell death in IFN-α-treated KUNCD20 cells
was not detected by either light microscopy or annexin V
staining (results not shown). These results indicated
that IFN-induced apoptosis is a property of
HCV-expressing HeLa cells rather than a general property
of HeLa cells replicating viral RNA genomes.

In summary, these results demonstrated that IFN-α
could induce noncytopathic as well as cytopathic
antiviral programs in cells expressing HCV replicons in a
concentration- and cell type-dependent fashion. Moreover,
the results showed that this antiviral program was
specific for HCV replicons. Importantly, the results
suggested that HCV replication could induce an innate
cellular response that, in combination with IFN-α, could
lead to apoptosis.

IFN-α inhibits HCV replication through the Jak-STAT
signal transduction pathway

Information about the signal transduction pathways
responsible for execution of the IFN response has
generally been obtained with cells treated with high
concentrations (100 to 1,000 IU/ml) of the cytokine and
with fibroblasts and epithelial cells, most of which
cannot, to date, support HCV replication. Moreover, a
recent study by Schlaak et al. (Schlaak, J. F., et al.,
2002. J. Biol. Chem. 277:49428-49437) revealed that the
IFN response could vary in a cell type-dependent manner.
In addition, it was found that slight changes in cell
culture conditions had major effects on HCV replication.
Therefore, the observation that replication of HCV in
both Huh7 and HeLa cells could be inhibited with low
concentrations of the cytokine warranted a more careful
study of the pathways involved in the antiviral program
against HCV.

To investigate the nature of the IFN-α response
against the HCV replicon, drugs that were known to
inhibit specific components of selected signal
transduction pathways that play a role in the antiviral
response induced by IFN-α were used (Table 5). The
current model for the signal transduction pathway induced
by IFN-α predicts that the IFN receptor-associated
tyrosine kinases Jak1 and Tyk2 are activated and, in
turn, phosphorylate the transcription factors STAT1 and
STAT2, which are required for the induction of the
cellular antiviral program (Figure 10) (Sen, G. C. 2001.
Annu. Rev. Microbiol. 55:255-281; Stark, G. R., et al.,
1998. Annu. Rev. Biochem. 67:227-264). Incubation of
GS4.1 cells with the tyrosine kinase inhibitor genistein
suppressed the induction of the IFN-induced Mx-1 gene
(Figures 11A and 11B). Similarly, genistein antagonized
the IFN-α response against the HCV replicon. An increase
in the concentration of the drug from 100 to 300 μM led
to a complete inhibition of the IFN response against HCV
(results not shown). Consistent with this result, it was
found that the V protein of HPIV2 blocked the IFN
response. The V protein of HPIV2 induces the degradation
of STAT2 and, hence, inhibits the IFN-induced activation
of gene expression (Parisien, J. P., et al., 2002. J.
Virol. 76:4190-4198) (Figures 11C and 11D). IFN-α
treatment of cells expressing the HPIV2 V protein led to
a twofold reduction of viral RNA levels. When adjusted
for the observed transfection efficiency, i.e., 40 to 45%
of the cells expressed the V protein, the reduction
corresponded to a complete inhibition of the IFN-α
response. Finally, the decline of viral RNA levels was
reduced in GS4.1 cells treated with IFN-α and
cycloheximide, indicating that de novo protein synthesis
was required for an antiviral response against HCV
replication (Table 5).
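The transfection-efficiency adjustment described above can be illustrated with a simple mixed-population calculation. The sketch below assumes that every cell carries the same amount of replicon RNA before treatment, that IFN has no effect in cells expressing the V protein, and that responding cells retain a hypothetical 10% of their RNA; none of these values is taken from the text, and the function name is illustrative only.

```python
def residual_rna_in_mixed_population(fraction_blocked, residual_in_responders):
    """Fraction of the untreated RNA level expected after IFN treatment.

    fraction_blocked: fraction of cells in which the IFN response is fully blocked
    residual_in_responders: fraction of RNA remaining in normally responding cells
    """
    for value in (fraction_blocked, residual_in_responders):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return fraction_blocked + (1.0 - fraction_blocked) * residual_in_responders


# Hypothetical residual RNA level in cells that do not express the V protein.
HYPOTHETICAL_RESIDUAL = 0.10

for blocked in (0.40, 0.45):  # transfection efficiencies stated in the text
    residual = residual_rna_in_mixed_population(blocked, HYPOTHETICAL_RESIDUAL)
    print(f"blocked={blocked:.2f} residual={residual:.3f} "
          f"fold reduction={1.0 / residual:.2f}")
```

With these illustrative inputs the expected reduction for the whole culture is close to twofold, which is how a twofold reduction in a partially transfected population can be consistent with a complete block in the cells that express the V protein.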

IFN-α can activate, in addition to the STAT pathway,
MAPKs, including extracellular signal-regulated kinase
and p38 MAPK, and the phosphatidylinositol 3
(PI3)-kinase-Akt pathway (David, M., et al., 1995.
Science 269:1721-1723; Goh, K. C., et al., 1999. EMBO J.
18:5601-5608; Pfeffer, L. M., et al., 1997. Science
276:1418-1420). However, in contrast to genistein,
SB 203580, sodium salicylate, and wortmannin, known
inhibitors of p38 MAPK, NF-κB, and PI-3 kinase,
respectively, did not inhibit the IFN response at
detectable levels, suggesting that the two major
ancillary signaling pathways activated by IFN-α were not
directly involved in inhibiting HCV replication in Huh7
cells (Table 5; results not shown).

In summary, the results showed that inhibition of
HCV replication with IFN-α depended on a functional
Jak-STAT pathway (Figure 10). Hence, the results
demonstrated that the IFN response against HCV was
genuine and did not reflect an unspecific effect of the
cytokine.

Does HCV replication induce an antiviral state in
infected cells? A critical step in activation of innate
immunity is the induction of an antiviral state by dsRNA
or viral proteins (Figure 10) (Taniguchi, T., and A.
Takaoka. 2002. Curr. Opin. Immunol. 14:111-116; tenOever,
B. R., et al., 2002. J. Virol. 76:3659-3669). As shown
above, evidence for such a virus-induced activation was
also obtained from IFN-treated HeLa cells expressing HCV
replicons. To investigate the nature of this HCV-induced
activation, the phosphorylation levels of eIF-2α and
expression of IFN-β were determined. eIF-2α is a
substrate of PKR, which is activated by dsRNA that can
accumulate as a consequence of viral RNA replication
(Srivastava, S. P., et al., 1998. J. Biol. Chem.
273:2416-2423; Williams, B. R. 3 July 2001, posting date.
Signal integration via PKR. Sci. STKE 2001:RE2.
[Online.]). IFN-β gene transcription is activated through
the coordinate actions of three families of transcription
factors, NF-κB, IRF3, and ATF2, all of which are activated
by dsRNA and/or certain viral proteins (Figure 10)
(Peters, K. L., et al., 2002. Proc. Natl. Acad. Sci. USA
99:6322-6327; tenOever, B. R., et al., 2002. J. Virol.
76:3659-3669).

First, the levels of eIF-2α-P in Huh7, GS4.1, HeLa,
and SL1 cells were determined in the presence and absence
of dsRNA and IFN-α. The results showed that eIF-2α-P
levels were not significantly elevated in cells
expressing HCV replicons (GS4.1 and SL1) compared with
those in their parental cells (Huh7 and HeLa) (Figure
12A). Similarly, incubation of cells with IFN-α did not
increase eIF-2α-P levels. In contrast, transfection of
cells with poly(I:C), mimicking dsRNA, augmented eIF-2α-P
levels, particularly in HeLa and SL1 cells. Similar
results were obtained when cells were primed with IFN-α
prior to transfection with poly(I:C). These results were
confirmed with several other cell lines derived from Huh7
and HeLa cells.

Second, the levels of IFN-β mRNA in the four cell
lines were determined under the same conditions described
above. In agreement with the above results, viral
replication alone was not sufficient to activate IFN-β
gene expression in either Huh7- or HeLa-derived cell lines
(Figure 12B). In Huh7 cells and GS4.1 and other
Huh7-derived cells expressing HCV replicons, only a weak
induction of IFN-β was observed when cells were primed
with IFN-α and then transfected with poly(I:C). In
contrast, IFN-β transcription was induced in HeLa and SL1
cells by poly(I:C) alone and particularly in combination
with IFN-α. Remarkably, expression of IFN-β could be
induced by IFN-α alone in SL1 cells but not in HeLa
cells. Similar results were obtained with the
HeLa-derived SL2 cell line (results not shown).

In summary, the results showed that, while both Huh7
and HeLa cells were competent to activate dsRNA-dependent
signal transduction pathways, HCV replication alone was
not sufficient to induce a detectable dsRNA response in
these cells. This result could indicate that dsRNA either
does not accumulate during HCV replication or cannot
access PKR and other dsRNA binding proteins. Importantly,
the results showed that, despite the apparent lack of
biologically active dsRNA, viral replication could
activate certain cellular signal transduction pathways
that could cooperate with IFN-α to activate the
transcription of the IFN-β gene.

The results presented above favored a model
predicting that IFN-α inhibited HCV replication by a
mechanism that was independent of dsRNA-activated
antiviral pathways. To test this model more carefully,
the IFN response was measured in GS4.1 cells expressing
the vaccinia virus E3L protein. E3L is known to sequester
dsRNA and prevent PKR and OAS/RNase L activation (Chang,
H. W., et al., 1992. Proc. Natl. Acad. Sci. USA
89:4825-4829; Rivas, C., et al., 1998. Virology
243:406-414) (Figure 10). Indeed, expression of E3L had no
measurable effect on the IFN response against HCV
(Figures 11C and 11D). Experiments relying on
simultaneous detection of E3L and the HCV protein NS5A in
the same cell by immunofluorescence confirmed that IFN-α
could inhibit HCV replication in cells expressing the E3L
protein (results not shown). Finally, it was found that
the PKR inhibitors 2-aminopurine (2-AP) and adenovirus VA
RNA did not block the IFN response against HCV in Huh7
cells (Table 5).

TABLE 5. Effects of inhibitors on the activity of IFN-α
against the HCV replicon (a)

Drug, protein, or RNA   Concn       Primary target(s)       Effect
-------------------------------------------------------------------
2-AP                    10          PKR and other kinases
Genistein               300 μM      Tyrosine kinases
Cycloheximide           10 μg/ml    Translation
Sodium salicylate       5 mM
SB 203580               20 μM       p38 MAPK
PD 98059                50 μM       MEK kinase
Vanadate                50 μM       Protein phosphatase
Wortmannin              100 nM      PI 3 kinase
PP2                     50 μM       src kinase
Rapamycin               200 nM      mTOR, translation
Lactacystin             5 μM        26S proteasome          +
Epoxomicin              1 μM        26S proteasome
V protein of HPIV2                  STAT2
E3L protein                         PKR and OAS
VA RNA

(a) GS4.1 cells were incubated with the indicated compounds
for 2 h and then in the presence of 100 IU of IFN-α/ml for an
additional 24 h. Viral RNA levels were determined by Northern
blot analysis. Assays for V protein, E3L, and VA RNA are
described in the legend to Fig. 5.

What are the pathways that play a role in the IFN-α
response against HCV? A major question concerns the
mechanism by which IFN-α induces the noncytopathic
inhibition of HCV replication. DNA microarray analyses of
IFN-α-treated GS4.1 cells and other Huh7-derived cell
lines revealed the induction of several classes of genes
belonging to known signal transduction and protein
degradation pathways (J. Hayashi and C. Seeger,
unpublished results; Cheney, I. W., et al., 2002. J.
Virol. 76:11148-11154). In particular, several genes
encoding proteasome subunits and ubiquitin-like proteins
were among the genes most highly induced by IFN-α.
Notably, kinetic studies of HCV replication in Huh7 cells
indicated that replication complexes have a relatively
short half-life, which is further reduced by IFN treatment
(J.-T. Guo and C. Seeger, unpublished observations).
Therefore, it was surmised that the proteasome could play
a role in inhibition of HCV replication in IFN-treated
cells.

To test this hypothesis, the effect of combination
treatment with IFN-α and the proteasome inhibitors
lactacystin and epoxomicin on HCV RNA replication in
GS4.1 cells was determined. The cells were pretreated
with different concentrations of the inhibitors for 1 h
before IFN-α was added for an additional 6 h of
combination treatment. Then the cells were incubated for
12 h before RNA was isolated and subjected to Northern
blot analysis (Figure 13A). The relatively short
incubation period was necessary because of the known
toxicity of proteasome inhibitors after longer incubation
times. The results showed that HCV RNA levels dropped 70%
within 18 h of IFN-α treatment, whereas in the presence
of epoxomicin or lactacystin the reduction was only 30%
(Figure 13B). Lower doses of epoxomicin than of
lactacystin were effective, which is consistent with the
high specific activity of epoxomicin against the
chymotrypsin-like activity of proteasomes (Fenteany, G.,
and S. L. Schreiber. 1998. J. Biol. Chem. 273:8545-8548;
Meng, L., et al., 1999. Proc. Natl. Acad. Sci. USA
96:10403-10408). Treatment with higher doses of
lactacystin alone led to a slight reduction of HCV RNA
levels. These results were confirmed with a second set of
experiments. The cells were pretreated with 5 μM
lactacystin or 1 μM epoxomicin for 1 h before IFN-α was
added for an additional 6 h of combination treatment.
RNA was isolated from the treated cells either 6 or 12 h
after incubation with IFN-α (Figure 14). The results
showed that, at both time points, viral RNA levels were
significantly higher in cells that were exposed to the
proteasome inhibitors than in cells that were treated
with IFN-α alone.
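The 70% and 30% reductions reported above can be restated as the share of the IFN-induced reduction that was lost in the presence of a proteasome inhibitor. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the calculation assumes that the two percentages are measured against the same untreated control.

```python
def fraction_of_effect_blocked(reduction_ifn_alone, reduction_with_inhibitor):
    """Share of the IFN-induced RNA reduction that is lost with an inhibitor."""
    if reduction_ifn_alone <= 0.0:
        raise ValueError("IFN alone must produce a measurable reduction")
    return (reduction_ifn_alone - reduction_with_inhibitor) / reduction_ifn_alone


# Reductions in HCV RNA stated in the text (fractions of the untreated level).
REDUCTION_IFN_ALONE = 0.70
REDUCTION_WITH_INHIBITOR = 0.30

blocked = fraction_of_effect_blocked(REDUCTION_IFN_ALONE, REDUCTION_WITH_INHIBITOR)
remaining_ifn_alone = 1.0 - REDUCTION_IFN_ALONE
remaining_with_inhibitor = 1.0 - REDUCTION_WITH_INHIBITOR

print(f"share of IFN effect blocked: {blocked:.2f}")
print(f"RNA remaining, IFN alone: {remaining_ifn_alone:.2f}")
print(f"RNA remaining, IFN plus inhibitor: {remaining_with_inhibitor:.2f}")
```

On this reading, a little more than half of the IFN-induced reduction was lost when proteasome activity was inhibited, and more than twice as much viral RNA remained.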
Finally, tests were conducted to determine whether
proteasome activity was required for induction of the IFN
response or, more directly, for inhibition of HCV
replication. To distinguish between these two
possibilities, GS4.1 cells were first incubated with the
cytokine for 10 h to induce the antiviral response. Then,
the cells were incubated for 12 h in the presence of
lactacystin or epoxomicin. Under these conditions, the
IFN response against HCV remained effective and reduced
RNA levels to similar extents independently of the
presence of either of the two inhibitors (Figure 15).
Thus, these results indicated that the activity of
proteasomes is required for the induction of the IFN
response against HCV, but apparently not for direct
inhibition of viral replication (see Discussion).


DISCUSSION 

In this Example, the mechanism of the IFN-α response
against subgenomic replicons of HCV in Huh7 and HeLa
cells is investigated. The following conclusions can be
drawn from these investigations. First, it can be
concluded that IFN-α can inhibit HCV replication by both
noncytopathic and cytopathic mechanisms. The results
demonstrating that SL1 cells treated with IFN-α (100
IU/ml) underwent programmed cell death raised the
question of whether apoptosis contributes to the rapid
decline of HCV RNA levels observed during the first 48 h
of IFN therapy (Neumann, A. U., et al., 1998. Science
282:103-107). The answer depends on whether HeLa or Huh7
cells mimic the scenario in HCV-infected hepatocytes in
vivo. It is known from this and other studies that Huh7
cells exhibit an attenuated response to dsRNA and cannot
induce an apoptotic program (results not shown)
(Keskinen, P., et al., 1999. Virology 263:364-375;
Lanford, R. E., et al., 2003. J. Virol. 77:1092-1104;
McNair, A. N., et al., 1994. J. Gen. Virol.
75:1371-1378). In contrast, HeLa cells respond to dsRNA in
a fashion similar to that in which primary chimpanzee
hepatocytes respond. For example, treatment of chimpanzee
primary hepatocyte cultures with poly(I:C) led to the
induction of IFN-β, as shown in this report with HeLa
cells (Figure 12) (Lanford, R. E., et al., 2003. J.
Virol. 77:1092-1104). Therefore, it is likely that HeLa
cells represent a more physiological model for
hepatocytes in terms of IFN response than Huh7 cells. It
was notable that only a fraction of SL1 cells died after
treatment with IFN-α. One possibility is that apoptosis
was induced in cells that replicated above-average levels
of HCV RNA. In support of this possibility, reduction of
viral RNA levels by treatment with heat or HCV RNA
polymerase inhibitors reduced the number of apoptotic
cells after IFN-α treatment (Figure 8 and results not
shown). Based on these results it was concluded that HeLa
cells did not undergo apoptosis by default after IFN-α
treatment. In fact, it appears that apoptosis is a
hallmark of HCV-replicating HeLa cells, because HeLa
cells replicating Kunjin virus RNA remained viable after
IFN treatment. Finally, it appears that the observation
reported here represents the first example of
IFN-α-induced apoptosis of cells replicating an
apparently noncytolytic RNA virus.

Second, it can be concluded that the noncytopathic
response can occur independently of dsRNA-dependent
pathways. Although these results showed that dsRNA
response pathways were at least partially functional in
normal and HCV-replicating Huh7 cells and were intact in
HeLa cells, as indicated by poly(I:C)-induced
phosphorylation of eIF-2α and IFN-β gene transcription,
viral replication per se did not induce such responses
(Figure 12). The expression of the vaccinia virus E3L
protein or treatment of cells with the kinase inhibitor
2-AP had no measurable effect on the IFN response against
the HCV replicon (Figure 11 and Table 5). These
observations were consistent with the notion that
dsRNA-dependent antiviral pathways, such as the PKR and
RNase L pathways, were not involved in IFN-induced
inhibition of HCV replication in Huh7 cells. Whether they
play a role in HeLa cells is not yet known. Efforts to
express E3L in SL1 cells were not successful due to the
apparent toxicity of the protein, and treatment of HeLa
cells with 2-AP for more than 12 h induced apoptosis
(results not shown). Importantly, it is not yet known
whether IFN-induced apoptosis in HCV-expressing cells is
dependent on PKR or other dsRNA-dependent pathways (see
below).

In summary, the results showed that, while both Huh7
and HeLa cells were competent to activate dsRNA-dependent
signal transduction pathways, HCV replication alone was
not sufficient to induce a detectable dsRNA response in
these cells. This result could indicate that dsRNA either
did not accumulate during HCV replication or was not
accessible to PKR and other dsRNA binding proteins.

Third, it can be concluded that HCV replication can
induce innate immune pathways. In SL1 and other
HCV-expressing HeLa cell lines (results not shown), but
not in normal HeLa cells, IFN-α induced the expression of
IFN-β. This indicates that HCV replication activated an
unknown cellular factor, perhaps a virus-activated kinase
as proposed by Smith and colleagues (Smith, E. J., et
al., 2001. J. Biol. Chem. 276:8951-8957), that, in turn,
activated one or more transcription factors required for
IFN-β transcriptional activation. Candidates include
IRF3, NF-κB, and ATF-2 (Figure 10). Because normal HeLa
cells did not undergo apoptosis after IFN-α treatment, it
can be concluded that HCV expression directs the IFN-α
response into a cytopathic process.

Fourth, it can be concluded that the antiviral
activity of IFN-α against HCV depends, in part, on
functional proteasomes. How IFN-induced antiviral
programs inhibit viral replication noncytopathically is
not yet understood. The results shown here demonstrate
that, for HCV replication, proteasomes were required for
this process. However, the idea that proteasomes were
directly involved in inhibition of HCV replication by
increasing the turnover of replication complexes or viral
proteins was not supported by these results. Instead,
evidence was obtained that induction of the IFN response
was dependent on degradation of one or several proteins.
Previously, it has been shown (Li, X. L., and B. A.
Hassel. 2001. Cytokine 14:247-252) that proteasome
inhibitors attenuated the induction of certain
IFN-stimulated genes. Because epoxomicin and lactacystin
did not inhibit induction of Mx-A (results not shown),
which is dependent on activation of the Jak-STAT pathway
for the formation of ISGF3, proteasomes might be involved
in the induction of the second-wave IFN-stimulated genes.
Such a model is consistent with results published
previously by Li and Hassel (Li, X. L., and B. A. Hassel.
2001. Cytokine 14:247-252), who found that treatment of
cells with proteasome inhibitors did not inhibit
phosphorylation of STAT1 and binding of ISGF3 to DNA. As
a consequence of these results, the number of IFN-induced
genes that play a role in inhibition of HCV replication
by IFN-α can be reduced to those that are repressed by
epoxomicin.

An important implication of these results for
clinical IFN-α therapy and the pathogenesis of HCV
infections is that, besides the noncytopathic antiviral
effects, IFN-α might also induce apoptosis of
HCV-infected hepatocytes. At first glance, this
possibility might be discounted because drug-induced cell
death could lead immediately to the destruction of the
infected liver. However, it is possible that only a
fraction of hepatocytes express levels of HCV high enough
to activate an apoptotic program in the presence of the
high levels of IFN-α that are used for antiviral therapy.
In this scenario cell death would occur unnoticed. An
important consequence of such a scenario would be that
cell killing could play a major role in the recovery from
chronic HCV infections, similar to the situation in
natural recovery from transient infections with woodchuck
hepatitis virus, a model for hepatitis B virus infections
(Guo, J. T., et al., 2000. J. Virol. 74:1495-1505).



Table 6. Listing of Sequence ID Numbers

Sequence                Sequence ID number
------------------------------------------
I377/NS3-3'             SEQ ID NO: 1
pZS1                    SEQ ID NO: 2
pZS2                    SEQ ID NO: 3
pZS4                    SEQ ID NO: 4
pZS5                    SEQ ID NO: 5
pZS6                    SEQ ID NO: 6
pZS8                    SEQ ID NO: 7
pZS10                   SEQ ID NO: 8
pZS11                   SEQ ID NO: 9
pZS12                   SEQ ID NO: 10
pZS15                   SEQ ID NO: 11
pZS20                   SEQ ID NO: 12
pZS25                   SEQ ID NO: 13
pZS45                   SEQ ID NO: 14
Mx-A cDNA primer #1     SEQ ID NO: 15
Mx-A cDNA primer #2     SEQ ID NO: 16

While certain preferred embodiments of the present
invention have been described and specifically
exemplified above, it is not intended that the invention
be limited to such embodiments. Various modifications
may be made to the invention without departing from the
scope and spirit thereof as set forth in the following
claims.
WHAT IS CLAIMED IS:

1. A cell line which replicates hepatitis C virus
(HCV), wherein said cell line is selected from the group
consisting of a non-human cell line and a human
non-hepatic cell line.

2. The cell line of claim 1, wherein the human
non-hepatic cell line comprises epithelial cells.

3. The cell line of claim 2, wherein the human
epithelial cells are HeLa cells.

4. The cell line of claim 1, wherein the non-human
cell line comprises mouse cells of hepatic origin.

5. The cell line of claim 4, wherein the mouse
cells are Hepa1-6 cells.

6. The cell line of claim 4, wherein the mouse
cells are AML12 cells.

7. A non-human, non-chimpanzee host animal
comprising cells which replicate HCV.

8. The non-human host animal of claim 7, which is a
mouse.

9. A method for producing a human non-hepatic cell
that replicates HCV, comprising:

a) obtaining total RNA from a human hepatic
cell culture that replicates HCV, said total RNA
comprising a selection marker which renders cells
expressing said RNA resistant to a selection agent;

b) introducing the total RNA into human
non-hepatic cells; and

c) selecting those cells which grow in the
presence of said selection agent and replicate HCV.

10. The method of claim 9, wherein a cell line is
generated from the cells of step c).

11. A method of producing a non-human hepatic cell
that replicates HCV, comprising:

a) obtaining total RNA from a human non-hepatic
cell culture that replicates HCV, said total RNA
comprising a selection marker which renders cells
expressing said RNA resistant to a selection agent;

b) introducing the total RNA into non-human
cells; and

c) selecting those cells which grow in the
presence of said selection agent and replicate HCV.

12. The method of claim 11, wherein a cell line is
generated from the cells of step c).

13. A method for screening test compounds which
inhibit HCV replication, comprising:

a) culturing the cell line of claim 1 in the
presence and absence of a test compound; and

b) assaying HCV replication levels in the
presence and absence of said test compound, wherein a
reduced HCV replication level in the presence of said
test compound is indicative that said test compound
inhibits HCV replication.

14. An HCV polynucleotide having at least one of
the mutations shown in Table II.

15. A polyprotein encoded by the polynucleotide of
claim 14.

16. A method for screening test compounds which
modulate the antiviral response induced by interferon
alpha (IFN-α), comprising:

a) culturing the cell line of claim 1 in the
presence and absence of a test compound;

b) contacting the cells of step a) with IFN-α; and

c) measuring the HCV replication level in the
presence and absence of said compound, thereby identifying
agents which modulate the antiviral response mediated by
IFN-α as a function of altered HCV levels.

17. The method of claim 16, wherein the antiviral
response is enhanced.

18. The method of claim 16, wherein the antiviral
response is inhibited.
Figures 1A-1B
Figure 2
Figures 3A-3B
Figures 7A-7B
Figures 8A-8B
Figures 9A-9B
Figure 10
Figures 11A-11D
Figures 12A-12B



[Figures 12A-12B: two 16-lane panels for Huh7, GS4.1, HeLa and SL1 cells. One panel has rows labeled IFN-α, Poly(I:C), eIF-2α-P and Total eIF-2; the other has rows labeled Poly(I:C), IFN-α, IFN-β and β-actin.]
Figures 13A-13C
Figures 14A-14B
[Figure 14B: plot with a y-axis tick at 30000 and treatment rows labeled IFN-α, Epoxomicin and Lactacystin.]
Figures 15A-15B
[Treatment labels: IFN-α, Lactacystin, Epoxomicin.]
SEQUENCE LISTING

<110> Qing Zhu 

Ju-Tao Guo 
Christoph Seeger 

<120> Replication of Hepatitis C Virus in 

Non-Hepatic Epithelial and Mouse Hepatic Cells 



<130> 0149 PO3068 

<140> Not Yet Assigned 
<141> 2003-12-12 

<150> 60/433,303 
<151> 2002-12-13 

<160> 16 

<170> FastSEQ for Windows Version 3.0

<210> 1 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid 
<400> 1 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840
ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900
ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020
tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080
ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200
gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260
cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320
ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380
aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440
gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500
aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560
gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620
tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680
atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740
aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800
atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860
agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920
acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980
ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040
caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatgga aaccactatg 2340

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660

atggcatgca tgtcggctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtccttta ccgggagttc 3840 

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacagggaat gcagctcgcc 3900 

gaacaattca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg tggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctagccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggactct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg acgtcgacag cggcacggca 5820

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 
acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120 

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020 

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320 

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300 

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggnac ctcgctaacg 9360 

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 



WO 2004/055216



PCT/US2003/039722 



acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280 

tagatccttt tctagataat acgactcact ata 11313 



<210> 2 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid 
<400> 2 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 
agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920

acacaatctt tcctggcgac ctgcgtaaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatgga aaccactatg 2340

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820 

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300 

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcggctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtccttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacagggaat gcagctcgcc 3900 

gaacaattca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg tggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 
acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120

ctgcggnaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020 

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320

atctacgggg cctgttactc cattgagcca cttgacatac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300 

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 
cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840
gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 
tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280

tagatccttt tctagataat acgactcact ata 11313 



<210> 3 
<211> 11313 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Plasmid 
<400> 3 



gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 
atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatgga aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggn tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 
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tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120 

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020 

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120 

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280

tagatccttt tctagataat acgactcact ata 11313 



<210> 4 
<211> 11313 
<212> DNA

<213> Artificial Sequence 



<400> 4 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcataaac 360 

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagcgcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgttt cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300 

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120 

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300 

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380 

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280 

tagatccttt tctagataat acgactcact ata 11313 



<210> 5 
<211> 11313 
<212> DNA

<213> Artificial Sequence 



<400> 5 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 
gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120 

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020 

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320 

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120 

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 
aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420 

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtac tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280 

tagatccttt tctagataat acgactcact ata 11313 



<210> 6 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid 

<400> 6 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 
agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagcgcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgttt cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840 

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 
gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggag ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccggtacca 5700 

cctccacgga ggaagaggac ggttgtcctg ccagaatcta ccgtgtcttc cgccctggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctgactagcc ctccaacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcccccatgc ccccccttga gggggagccg ggggatcccg attccagcga cgggtcttgg 5940 

tctaccgtaa gcgagggggt tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120 

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320 

atctacgggg cctgttactc cattgagcaa cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 
atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420 

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380 

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280 

tagatccttt tctagataat acgactcact ata 11313 



<210> 7 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Plasmid 
<400> 7 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 
gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagcgcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgttt cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300 

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 
accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctagccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggactct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggag ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccggtacca 5700 

cctccacgga ggaagaggac ggttgtcctg ccagaatcta ccgtgtcttc cgccctggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820

acggcctctc ctgactagcc ctccaacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcccccatgc ccccccttga gggggagccg ggggatcccg attccagcga cgggtcttgg 5940 

tctaccgtaa gcgagggggt tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cacaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320 

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 
atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420 

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280 

tagatccttt tctagataat acgactcact ata 11313 



<210> 8 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid 
<400> 8 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 
tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 
aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 
cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660

atggcatgca tgtcggctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtccttta ccgggagttc 3840 

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacagggaat gcagctcgcc 3900 

gaacaattca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg tggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 



WO 2004/055216



PCT/US2003/039722 



gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctagccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460

tcagaaaata aggtagtaat tttggactct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640 

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 
ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760
cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 
caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880
atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 
actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 
gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060
atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120 
caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 
acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 
aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300
cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 
gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420
ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 
tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 
cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600
gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 
tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 
tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 
cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 
gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 
tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280

tagatccttt tctagataat acgactcact ata 11313 



<210> 9 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid 
<400> 9 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480
ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540
acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600
cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380 

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820 

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300 

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcggctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtccttta ccgggagttc 3840 

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacagggaat gcagctcgcc 3900

gaacaattca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg tggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 
cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320 

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520 
gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120 

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300 

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420 

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020 

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280 

tagatccttt tctagataat acgactcact ata 11313 



<210> 10 
<211> 11313 
<212> DNA

<213> Artificial Sequence
<220> 

<223> Plasmid 
<400> 10 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 
gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 
ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 
cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300 

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 
atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctagccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggactct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagagatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120 

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420 

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780 
actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020 

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620 

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100 

agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160 

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220 

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcactatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 
tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400
tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460
ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520
gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 
ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640
gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 
ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 
cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 
caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 
atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 
actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 
gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 
atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120 
caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 
acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 
aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300
cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 
gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420 
ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480
tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540 
cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 
gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 
tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 
tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 
cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840
gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 
tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380 

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800 

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920 

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280

tagatccttt tctagataat acgactcact ata 11313 



<210> 11 
<211> 11184 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid 
<400> 11 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240
gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380 

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatgga aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcggctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtccttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacagggaat gcagctcgcc 3900

gaacaattca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960

gctgctcccg tggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200 
gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cactcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccggtacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctggtgagga cgtcgtctgc tgctcgatgt cctacacatg gacaggcgcc 5880

ctgatcacgc catgcgctgc ggaggaaacc aagctgccca tcaatgcact gagcaactct 5940 

ttgctccgac accacaactt ggtctatgct acaacatctc gcagcgcaag cctgcggcag 6000 

aagaaggtca cctttgacag actgcaggtc ctggacgacc actaccggga cgtgctcaag 6060 

gagatgaagg cgaaggcgtc cacagttaag gctaaacttc tatccgtgga ggaagcctgt 6120 

aagctgacgc ccccacattc ggccagatct aaatttggct atggggcaaa ggacgtccgg 6180 

aacctatcca gcaaggccgt taaccacatc cgctccgtgt ggaaggactt gctggaagac 6240 

actgagacac caattgacac caccatcatg gcaaaaaatg aggttttctg cgtccaacca 6300 

gagaaggggg gccgcaagcc agctcgcctt atcgtattcc cagatttggg ggttcgtgtg 6360 

tgcgagaaaa tggcccttta cgatgtggtc tccaccctcc ctcaggccgt gatgggctct 6420

tcatacggat tccaatactc tcctggacag cgggtcgagt tcctggtgaa tgcctggaaa 6480 

gcgaagaaat gccctatggg cttcgcatat gacacccgct gttttgactc aacggtcact 6540 

gagaatgaca tccgtgttga ggagtcaatc taccaatgtt gtgacttggc ccccgaagcc 6600 

agacaggcca taaggtcgct cacagagcgg ctttacatcg ggggccccct gactaattct 6660 

aaagggcaga actgcggcta tcgccggtgc cgcgcgagcg gtgtactgac gaccagctgc 6720 

ggtaataccc tcacatgtta cttgaaggcc gctgcggcct gtcgagctgc gaagctccag 6780 

gactgcacga tgctcgtatg cggagacgac cttgtcgtta tctgtgaaag cgcggggacc 6840 

caagaggacg aggcgagcct acgggccttc acggaggcta tgactagata ctctgccccc 6900 

cctggggacc cgcccaaacc agaatacgac ttggagttga taacatcatg ctcctccaat 6960 

gtgtcagtcg cgcacgatgc atctggcaaa agggtgtact atctcacccg tgaccccacc 7020 

accccccttg cgcgggctgc gtgggagaca gctagacaca ctccagtcaa ttcctggcta 7080 

ggcaacatca tcatgtatgc gcccaccttg tgggcaagga tgatcctgat gactcatttc 7140 

ttctccatcc ttctagctca ggaacaactt gaaaaagccc tagattgtca gatctacggg 7200

gcctgttact ccattgagcc acttgaccta cctcagatca ttcaacgact ccatggcctt 7260 

agcgcatttt cactccatag ttactctcca ggtgagatca atagggtggc ttcatgcctc 7320

aggaaacttg gggtaccgcc cttgcgagtc tggagacatc gggccagaag tgtccgcgct 7380

aggctactgt cccagggggg gagggctgcc acttgtggca agtacctctt caactgggca 7440 

gtaaggacca agctcaaact cactccaatc ccggctgcgt cccagttgga tttatccagc 7500 

tggttcgttg ctggttacag cgggggagac atatatcaca gcctgtctcg tgcccgaccc 7560 

cgctggttca tgtggtgcct actcctactt tctgtagggg taggcatcta tctactcccc 7620 

aaccgatgaa cggggaccta aacactccag gccaataggc catcctgttt ttttcccttt 7680

ttttttttct tttttttttt tttttttttt tttttttttt tttctccttt ttttttcctc 7740 

tttttttcct tttctttcct ttggtggctc catcttagcc ctagtcacgg ctagctgtga 7800

aaggtccgtg agccgcttga ctgcagagag tgctgatact ggcctctctg cagatcaagt 7860 

actcctgcag gcgcgccact agtgggaata cgcggggtat gccgcgtttt agcatattga 7920 

cgacccaatt ctcatgtttg acagcttatc atcgataagc tttaatgcgg tagtttatca 7980 

cagttaaatt gctaacgcag tcaggcaccg tgtatgaaat ctaacaatgc gctcatcgtc 8040

atcctcggca ccgtcaccct ggatgctgta ggcataggct tggttatgcc ggtactgccg 8100 

ggcctcttgc gggatatcgt ccattccgac agcatcgcca gtcactatgg cgtgctgcta 8160
gcgctatatg cgttgatgca atttctatgc gcacccgttc tcggagcact gtccgaccgc 8220

tttggccgcc gcccagtcct gctcgcttcg ctacttggag ccactatcga ctacgcgatc 8280 

atggcgacca cacccgtcct gtggatcctc tacgccggac gcatcgtggc cggcatcacc 8340 

ggcgccacag gtgcggttgc tggcgcctat atcgccgaca tcaccgatgg ggaagatcgg 8400 

gctcgccact tcgggctcat gagcgcttgt ttcggcgtgg gtatggtggc aggccccgtg 8460 

gccgggggac tgttgggcgc catctccttg catgcaccat tccttgcggc ggcggtgctc 8520 

aacggcctca acctactact gggctgcttc ctaatgcagg agtcgcataa gggagagcgt 8580 

cgaccgatgc ccttgagagc cttcaaccca gtcagctcct tccggtgggc gcggggcatg 8640 

actatcgtcg cngcacttat gactgtcttc tttatcatgc aactcgtagg acaggtgccg 8700 

gcagcgctct gggtcatttt cggcgaggac cgctttcgct ggagcgcgac gatgatcggc 8760 

ctgtcgcttg cggtattcgg aatcttgcac gccctcgctc aagccttcgt cactggtccc 8820

gccaccaaac gtttcggcga gaagcaggcc attatcgccg gcatggcggc cgacgcgctg 8880 

ggctacgtct tgctggcgtt cgcgacgcga ggctggatgg ccttccccat tatgattctt 8940 

ctcgcttccg gcggcatcgg gatgcccgcg ttgcaggcca tgctgtccag gcaggtagat 9000 

gacgaccatc agggacagct tcaaggatcg ctcgcggctc ttaccagcct aacttcgatc 9060 

actggaccgc tgatcgtcac ggcgatttat gccgcctcgg cgagcacatg gaacgggttg 9120 

gcatggattg taggcgccgc cctatacctt gtctgcctcc ccgcgttgcg tcgcggtgca 9180 

tggagccggg ccacctcgac ctgaatggaa gccggcggca cctcgctaac ggattcacca 9240 

ctccaagaat tggagccaat caattcttgc ggagaactgt gaatgcgcaa accaaccctt 9300 

ggcagaacat atccatcgcg tccgccatct ccagcagccg cacgcggcgc atctcgggca 9360 

gcgttgggtc ctggccacgg gtgcgcatga tcgtgctcct gtcgttgagg acccggctag 9420 

gctggcgggg ttgccttact ggttagcaga atgaatcacc gatacgcgag cgaacgtgaa 9480 

gcgactgctg ctgcaaaacg tctgcgacct gagcaacaac atgaatggtc ttcggtttcc 9540 

gtgtttcgta aagtctggaa acgcggaagt cagcgccctg caccattatg ttccggatct 9600 

gcatcgcagg atgctgctgg ctaccctgtg gaacacctac atctgtatta acgaagcgct 9660 

ggcattgacc ctgagtgatt tttctctggt cccgccgcat ccataccgcc agttgtttac 9720

cctcacaacg ttccagtaac cgggcatgtt catcatcagt aacccgtatc gtgagcatcc 9780 

tctctcgttt catcggtatc attaccccca tgaacagaaa ttccccctta cacggaggca 9840 

tcaagtgacc aaacaggaaa aaaccgccct taacatggcc cgctttatca gaagccagac 9900 

attaacgctt ctggagaaac tcaacgagct ggacgcggat gaacaggcag acatctgtga 9960 

atcgcttcac gaccacgctg atgagcttta ccgcagctgc ctcgcgcgtt tcggtgatga 10020 

cggtgaaaac ctctgacaca tgcagctccc ggagacggtc acagcttgtc tgtaagcgga 10080 

tgccgggagc agacaagccc gtcagggcgc gtcagcgggt gttggcgggt gtcggggcgc 10140 

agccatgacc cagtcacgta gcgatagcgg agtgtatact ggcttaacta tgcggcatca 10200 

gagcagattg tactgagagt gcaccatatg cggtgtgaaa taccgcacag atgcgtaagg 10260

agaaaatacc gcatcaggcg ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 10320 

gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 10380 

tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 10440 

aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 10500 

aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 10560 

ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 10620 

tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 10680 

agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 10740 

gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 10800 

tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 10860 

acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt atttggtatc 10920 

tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 10980 

caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 11040 

aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 11100 

aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 11160 

ttctagataa tacgactcac tata 11184 



<210> 12 
<211> 11313 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Plasmid
<400> 12 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 
tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 
cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 
gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380 

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920 

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatgga aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820 

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840 

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080 

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 
accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatnctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctagccag 5340

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggactct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cacgcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccgatacca 5700

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctgaccagcc ctccgacgac ggcgacgcgg gatccgacgt tgagtcgtac 5880 

tcctccatgc ccccccttga gggggagccg ggggatcccg atctcagcga cgggtcttgg 5940 

tctaccgtaa gcgaggaggc tagtgaggac gtcgtctgct gctcgatgtc ctacacatgg 6000 

acaggcgccc tgatcacgcc atgcgctgcg gaggaaacca agctgcccat caatgcactg 6060 

agcaactctt tgctccgtca ccacaacttg gtctatgcta caacatctcg cagcgcaagc 6120

ctgcggcaga agaaggtcac ctttgacaga ctgcaggtcc tggacgacca ctaccgggac 6180 

gtgctcaagg agatgaaggc gaaggcgtcc acagttaagg ctaaacttct atccgtggag 6240 

gaagcctgta agctgacgcc cccacattcg gccagatcta aatttggcta tggggcaaag 6300 

gacgtccgga acctatccag caaggccgtt aaccacatcc gctccgtgtg gaaggacttg 6360 

ctggaagaca ctgagacacc aattgacacc accatcatgg caaaaaatga ggttttctgc 6420

gtccaaccag agaagggggg ccgcaagcca gctcgcctta tcgtattccc agatttgggg 6480 

gttcgtgtgt gcgagaaaat ggccctttac gatgtggtct ccaccctccc tcaggccgtg 6540 

atgggctctt catacggatt ccaatactct cctggacagc gggtcgagtt cctggtgaat 6600 

gcctggaaag cgaagaaatg ccctatgggc ttcgcatatg acacccgctg ttttgactca 6660 

acggtcactg agaatgacat ccgtgttgag gagtcaatct accaatgttg tgacttggcc 6720 

cccgaagcca gacaggccat aaggtcgctc acagagcggc tttacatcgg gggccccctg 6780

actaattcta aagggcagaa ctgcggctat cgccggtgcc gcgcgagcgg tgtactgacg 6840 

accagctgcg gtaataccct cacatgttac ttgaaggccg ctgcggcctg tcgagctgcg 6900 

aagctccagg actgcacgat gctcgtatgc ggagacgacc ttgtcgttat ctgtgaaagc 6960 

gcggggaccc aagaggacga ggcgagccta cgggccttca cggaggctat gactagatac 7020 

tctgcccccc ctggggaccc gcccaaacca gaatacgact tggagttgat aacatcatgc 7080 

tcctccaatg tgtcagtcgc gcacgatgca tctggcaaaa gggtgtacta tctcacccgt 7140 

gaccccacca ccccccttgc gcgggctgcg tgggagacag ctagacacac tccagtcaat 7200 

tcctggctag gcaacatcat catgtatgcg cccaccttgt gggcaaggat gatcctgatg 7260 

actcatttct tctccatcct tctagctcag gaacaacttg aaaaagccct agattgtcag 7320 

atctacgggg cctgttactc cattgagcca cttgacctac ctcagatcat tcaacgactc 7380 

catggcctta gcgcattttc actccatagt tactctccag gtgagatcaa tagggtggct 7440 

tcatgcctca ggaaacttgg ggtaccgccc ttgcgagtct ggagacatcg ggccagaagt 7500 

gtccgcgcta ggctactgtc ccaggggggg agggctgcca cttgtggcaa gtacctcttc 7560 

aactgggcag taaggaccaa gctcaaactc actccaatcc cggctgcgtc ccagttggat 7620

ttatccagct ggttcgttgc tggttacagc gggggagaca tatatcacag cctgtctcgt 7680 

gcccgacccc gctggttcat gtggtgccta ctcctacttt ctgtaggggt aggcatctat 7740 

ctactcccca accgatgaac ggggacctaa acactccagg ccaataggcc atcctgtttt 7800 

tttccctttt tttttttctt tttttttttt tttttttttt tttttttttt ttctcctttt 7860 

tttttcctct ttttttcctt ttctttcctt tggtggctcc atcttagccc tagtcacggc 7920 

tagctgtgaa aggtccgtga gccgcttgac tgcagagagt gctgatactg gcctctctgc 7980 

agatcaagta ctcctgcagg cgcgccacta gtgggaatac gcggggtatg ccgcgtttta 8040 

gcatattgac gacccaattc tcatgtttga cagcttatca tcgataagct ttaatgcggt 8100
agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc taacaatgcg 8160

ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt ggttatgccg 8220

gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag tcaatatggc 8280 

gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct cggagcactg 8340 

tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc cactatcgac 8400 

tacgcgatca tggcgaccac acccgtcctg tggatcctct acgccggacg catcgtggcc 8460 

ggcatcaccg gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg 8520

gaagatcggg ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca 8580 

ggccccgtgg ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg 8640

gcggtgctca acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag 8700 

ggagagcgtc gaccgatgcc cttgagagcc ttcaacccag tcagctcctt ccggtgggcg 8760 

cggggcatga ctatcgtcgc cgcacttatg actgtcttct ttatcatgca actcgtagga 8820 

caggtgccgg cagcgctctg ggtcattttc ggcgaggacc gctttcgctg gagcgcgacg 8880 

atgatcggcc tgtcgcttgc ggtattcgga atcttgcacg ccctcgctca agccttcgtc 8940 

actggtcccg ccaccaaacg tttcggcgag aagcaggcca ttatcgccgg catggcggcc 9000 

gacgcgctgg gctacgtctt gctggcgttc gcgacgcgag gctggatggc cttccccatt 9060 

atgattcttc tcgcttccgg cggcatcggg atgcccgcgt tgcaggccat gctgtccagg 9120

caggtagatg acgaccatca gggacagctt caaggatcgc tcgcggctct taccagccta 9180 

acttcgatca ctggaccgct gatcgtcacg gcgatttatg ccgcctcggc gagcacatgg 9240 

aacgggttgg catggattgt aggcgccgcc ctataccttg tctgcctccc cgcgttgcgt 9300 

cgcggtgcat ggagccgggc cacctcgacc tgaatggaag ccggcggcac ctcgctaacg 9360 

gattcaccac tccaagaatt ggagccaatc aattcttgcg gagaactgtg aatgcgcaaa 9420 

ccaacccttg gcagaacata tccatcgcgt ccgccatctc cagcagccgc acgcggcgca 9480 

tctcgggcag cgttgggtcc tggccacggg tgcgcatgat cgtgctcctg tcgttgagga 9540

cccggctagg ctggcggggt tgccttactg gttagcagaa tgaatcaccg atacgcgagc 9600 

gaacgtgaag cgactgctgc tgcaaaacgt ctgcgacctg agcaacaaca tgaatggtct 9660 

tcggtttccg tgtttcgtaa agtctggaaa cgcggaagtc agcgccctgc accattatgt 9720 

tccggatctg catcgcagga tgctgctggc taccctgtgg aacacctaca tctgtattaa 9780 

cgaagcgctg gcattgaccc tgagtgattt ttctctggtc ccgccgcatc cataccgcca 9840 

gttgtttacc ctcacaacgt tccagtaacc gggcatgttc atcatcagta acccgtatcg 9900 

tgagcatcct ctctcgtttc atcggtatca ttacccccat gaacagaaat tcccccttac 9960 

acggaggcat caagtgacca aacaggaaaa aaccgccctt aacatggccc gctttatcag 10020

aagccagaca ttaacgcttc tggagaaact caacgagctg gacgcggatg aacaggcaga 10080 

catctgtgaa tcgcttcacg accacgctga tgagctttac cgcagctgcc tcgcgcgttt 10140 

cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca cagcttgtct 10200 

gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg 10260 

tcggggcgca gccatgaccc agtcacgtag cgatagcgga gtgtatactg gcttaactat 10320 

gcggcatcag agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga 10380

tgcgtaagga gaaaataccg catcaggcgc tcttccgctt cctcgctcac tgactcgctg 10440 

cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 10500 

tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 10560 

aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 10620 

catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 10680 

caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 10740 

ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt 10800

aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 10860 

gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 10920

cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 10980 

ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 11040 

tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 11100 

tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 11160 

cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 11220 

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 11280

tagatccttt tctagataat acgactcact ata 11313 

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180 

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcaa gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960 

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020 

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140 

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380 

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620 

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800 

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagctcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgtct cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatgga aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520 

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880 

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000 

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300 

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540 

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600 

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720 

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 
atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 

gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 
gctgctcccg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260 

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320 

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440 

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560 

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280 

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340 

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cactcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccggtacca 5700 

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820

acggcctctc ctggtgagga cgtcgtctgc tgctcgatgt cctacacatg gacaggcgcc 5880

ctgatcacgc catgcgctgc ggaggaaacc aagctgccca tcaatgcact gagcaactct 5940 

ttgctccgac accacaactt ggtctatgct acaacatctc gcagcgcaag cctgcggcag 6000 

aagaaggtca cctttgacag actgcaggtc ctggacgacc actaccggga cgtgctcaag 6060 

gagatgaagg cgaaggcgtc cacagttaag gctaaacttc tatccgtgga ggaagcctgt 6120 

aagctgacgc ccccacattc ggccagatct aaatttggct atggggcaaa ggacgtccgg 6180 

aacctatcca gcaaggccgt taaccacatc cgctccgtgt ggaaggactt gctggaagac 6240 

actgagacac caattgacac caccatcatg gcaaaaaatg aggttttctg cgtccaacca 6300

gagaaggggg gccgcaagcc agctcgcctt atcgtattcc cagatttggg ggttcgtgtg 6360

tgcgagaaaa tggcccttta cgatgtggtc tccaccctcc ctcaggccgt gatgggctct 6420 

tcatacggat tccaatactc tcctggacag cgggtcgagt tcctggtgaa tgcctggaaa 6480 

gcgaagaaat gccctatggg cttcgcatat gacacccgct gttttgactc aacggtcact 6540 

gagaatgaca tccgtgttga ggagtcaatc taccaatgtt gtgacttggc ccccgaagcc 6600 

agacaggcca taaggtcgct cacagagcgg ctttacatcg ggggccccct gactaattct 6660 

aaagggcaga actgcggcta tcgccggtgc cgcgcgagcg gtgtactgac gaccagctgc 6720

ggtaataccc tcacatgtta cttgaaggcc gctgcggcct gtcgagctgc gaagctccag 6780 

gactgcacga tgctcgtatg cggagacgac cttgtcgtta tctgtgaaag cgcggggacc 6840 

caagaggacg aggcgagcct acgggccttc acggaggcta tgactagata ctctgccccc 6900 

cctggggacc cgcccaaacc agaatacgac ttggagttga taacatcatg ctcctccaat 6960 

gtgtcagtcg cgcacgatgc atctggcaaa agggtgtact atctcacccg tgaccccacc 7020

accccccttg cgcgggctgc gtgggagaca gctagacaca ctccagtcaa ttcctggcta 7080 

ggcaacatca tcatgtatgc gcccaccttg tgggcaagga tgatcctgat gactcatttc 7140 

ttctccatcc ttctagctca ggaacaactt gaaaaagccc tagattgtca gatctacggg 7200

gcctgttact ccattgagcc acttgaccta cctcagatca ttcaacgact ccatggcctt 7260 

agcgcatttt cactccatag ttactctcca ggtgagatca atagggtggc ttcatgcctc 7320 

aggaaacttg gggtaccgcc cttgcgagtc tggagacatc gggccagaag tgtccgcgct 7380 

aggctactgt cccagggggg gagggctgcc acttgtggca agtacctctt caactgggca 7440 

gtaaggacca agctcaaact cactccaatc ccggctgcgt cccagttgga tttatccagc 7500

tggttcgttg ctggttacag cgggggagac atatatcaca gcctgtctcg tgcccgaccc 7560 

cgctggttca tgtggtgcct actcctactt tctgtagggg taggcatcta tctactcccc 7620 

aaccgatgaa cggggaccta aacactccag gccaataggc catcctgttt ttttcccttt 7680 

ttttttttct tttttttttt tttttttttt tttttttttt tttctccttt ttttttcctc 7740 

tttttttcct tttctttcct ttggtggctc catcttagcc ctagtcacgg ctagctgtga 7800 

aaggtccgtg agccgcttga ctgcagagag tgctgatact ggcctctctg cagatcaagt 7860 

actcctgcag gcgcgccact agtgggaata cgcggggtat gccgcgtttt agcatattga 7920 
cgacccaatt ctcatgtttg acagcttatc atcgataagc tttaatgcgg tagtttatca 7980

cagttaaatt gctaacgcag tcaggcaccg tgtatgaaat ctaacaatgc gctcatcgtc 8040

atcctcggca ccgtcaccct ggatgctgta ggcataggct tggttatgcc ggtactgccg 8100 

ggcctcttgc gggatatcgt ccattccgac agcatcgcca gtcactatgg cgtgctgcta 8160

gcgctatatg cgttgatgca atttctatgc gcacccgttc tcggagcact gtccgaccgc 8220

tttggccgcc gcccagtcct gctcgcttcg ctacttggag ccactatcga ctacgcgatc 8280 

atggcgacca cacccgtcct gtggatcctc tacgccggac gcatcgtggc cggcatcacc 8340 

ggcgccacag gtgcggttgc tggcgcctat atcgccgaca tcaccgatgg ggaagatcgg 8400 

gctcgccact tcgggctcat gagcgcttgt ttcggcgtgg gtatggtggc aggccccgtg 8460

gccgggggac tgttgggcgc catctccttg catgcaccat tccttgcggc ggcggtgctc 8520 

aacggcctca acctactact gggctgcttc ctaatgcagg agtcgcataa gggagagcgt 8580 

cgaccgatgc ccttgagagc cttcaaccca gtcagctcct tccggtgggc gcggggcatg 8640 

actatcgtcg ccgcacttat gactgtcttc tttatcatgc aactcgtagg acaggtgccg 8700 

gcagcgctct gggtcatttt cggcgaggac cgctttcgct ggagcgcgac gatgatcggc 8760 

ctgtcgcttg cggtattcgg aatcttgcac gccctcgctc aagccttcgt cactggtccc 8820 

gccaccaaac gtttcggcga gaagcaggcc attatcgccg gcatggcggc cgacgcgctg 8880 

ggctacgtct tgctggcgtt cgcgacgcga ggctggatgg ccttccccat tatgattctt 8940 

ctcgcttccg gcggcatcgg gatgcccgcg ttgcaggcca tgctgtccag gcaggtagat 9000 

gacgaccatc agggacagct tcaaggatcg ctcgcggctc ttaccagcct aacttcgatc 9060

actggaccgc tgatcgtcac ggcgatttat gccgcctcgg cgagcacatg gaacgggttg 9120 

gcatggattg taggcgccgc cctatacctt gtctgcctcc ccgcgttgcg tcgcggtgca 9180 

tggagccggg ccacctcgac ctgaatggaa gccggcggca cctcgctaac ggattcacca 9240 

ctccaagaat tggagccaat caattcttgc ggagaactgt gaatgcgcaa accaaccctt 9300

ggcagaacat atccatcgcg tccgccatct ccagcagccg cacgcggcgc atctcgggca 9360 

gcgttgggtc ctggccacgg gtgcgcatga tcgtgctcct gtcgttgagg acccggctag 9420 

gctggcgggg ttgccttact ggttagcaga atgaatcacc gatacgcgag cgaacgtgaa 9480 

gcgactgctg ctgcaaaacg tctgcgacct gagcaacaac atgaatggtc ttcggtttcc 9540 

gtgtttcgta aagtctggaa acgcggaagt cagcgccctg caccattatg ttccggatct 9600

gcatcgcagg atgctgctgg ctaccctgtg gaacacctac atctgtatta acgaagcgct 9660 

ggcattgacc ctgagtgatt tttctctggt cccgccgcat ccataccgcc agttgtttac 9720

cctcacaacg ttccagtaac cgggcatgtt catcatcagt aacccgtatc gtgagcatcc 9780 

tctctcgttt catcggtatc attaccccca tgaacagaaa ttccccctta cacggaggca 9840 

tcaagtgacc aaacaggaaa aaaccgccct taacatggcc cgctttatca gaagccagac 9900 

attaacgctt ctggagaaac tcaacgagct ggacgcggat gaacaggcag acatctgtga 9960 

atcgcttcac gaccacgctg atgagcttta ccgcagctgc ctcgcgcgtt tcggtgatga 10020 

cggtgaaaac ctctgacaca tgcagctccc ggagacggtc acagcttgtc tgtaagcgga 10080 

tgccgggagc agacaagccc gtcagggcgc gtcagcgggt gttggcgggt gtcggggcgc 10140 

agccatgacc cagtcacgta gcgatagcgg agtgtatact ggcttaacta tgcggcatca 10200

gagcagattg tactgagagt gcaccatatg cggtgtgaaa taccgcacag atgcgtaagg 10260

agaaaatacc gcatcaggcg ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 10320

gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 10380 

tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 10440 

aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 10500 

aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 10560 

ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 10620

tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 10680 

agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 10740

gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 10800 

tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 10860 

acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt atttggtatc 10920

tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 10980 

caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 11040 

aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 11100 

aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 11160 

ttctagataa tacgactcac tata 11184 



<210> 14 
<211> 11184 
<212> DNA 

<213> Artificial Sequence
<220> 
<400> 14

gccagccccc gattgggggc gacactccac catagatcac tcccctgtga ggaactactg 60 

tcttcacgca gaaagcgtct agccatggcg ttagtatgag tgtcgtgcag cctccaggac 120 

cccccctccc gggagagcca tagtggtctg cggaaccggt gagtacaccg gaattgccag 180

gacgaccggg tcctttcttg gatcaacccg ctcaatgcct ggagatttgg gcgtgccccc 240 

gcgagactgc tagccgagta gtgttgggtc gcgaaaggcc ttgtggtact gcctgatagg 300 

gtgcttgcga gtgccccggg aggtctcgta gaccgtgcac catgagcacg aatcctaaac 360 

ctcaaagaaa aaccaaaggg cgcgccatga ttgaacaaga tggattgcac gcaggttctc 420 

cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca atcggctgct 480 

ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt gtcaagaccg 540 

acctgtccgg tgccctgaat gaactgcagg acgaggcagc gcggctatcg tggctggcca 600 

cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga agggactggc 660 

tgctattggg cgaagtgccg gggcaggatc tcctgtcatc tcaccttgct cctgccgaga 720 

aagtatccat catggctgat gcaatgcggc ggctgcatac gcttgatccg gctacctgcc 780 

cattcgacca ccaagcgaaa catcgcatcg agcgagcacg tactcggatg gaagccggtc 840 

ttgtcgatca ggatgatctg gacgaagagc atcaggggct cgcgccagcc gaactgttcg 900 

ccaggctcaa ggcgcgcatg cccgacggcg aggatctcgt cgtgacccat ggcgatgcct 960

gcttgccgaa tatcatggtg gaaaatggcc gcttttctgg attcatcgac tgtggccggc 1020

tgggtgtggc ggaccgctat caggacatag cgttggctac ccgtgatatt gctgaagagc 1080 

ttggcggcga atgggctgac cgcttcctcg tgctttacgg tatcgccgct cccgattcgc 1140

agcgcatcgc cttctatcgc cttcttgacg agttcttctg agtttaaaca gaccacaacg 1200 

gtttccctct agcgggatca attccgcccc tctccctccc ccccccctaa cgttactggc 1260 

cgaagccgct tggaataagg ccggtgtgcg tttgtctata tgttattttc caccatattg 1320 

ccgtcttttg gcaatgtgag ggcccggaaa cctggccctg tcttcttgac gagcattcct 1380

aggggtcttt cccctctcgc caaaggaatg caaggtctgt tgaatgtcgt gaaggaagca 1440 

gttcctctgg aagcttcttg aagacaaaca acgtctgtag cgaccctttg caggcagcgg 1500 

aaccccccac ctggcgacag gtgcctctgc ggccaaaagc cacgtgtata agatacacct 1560 

gcaaaggcgg cacaacccca gtgccacgtt gtgagttgga tagttgtgga aagagtcaaa 1620

tggctctcct caagcgtatt caacaagggg ctgaaggatg cccagaaggt accccattgt 1680 

atgggatctg atctggggcc tcggtgcaca tgctttacat gtgtttagtc gaggttaaaa 1740 

aacgtctagg ccccccgaac cacggggacg tggttttcct ttgaaaaaca cgataatacc 1800

atggcgccta ttacggccta ctcccaacag acgcgaggcc tacttggctg catcatcact 1860 

agcctcacag gccgggacag gaaccaggtc gagggggagg tccaagtggt ctccaccgca 1920

acacaatctt tcctggcgac ctgcgtcaat ggcgtgtgtt ggactgtcta tcatggtgcc 1980 

ggctcaaaga cccttgccgg cccaaagggc ccaatcaccc aaatgtacac caatgtggac 2040 

caggacctcg tcggctggca agcgcccccc ggggcgcgtt ccttgacacc atgcacctgc 2100 

ggcagcgcgg acctttactt ggtcacgagg catgccgatg tcattccggt gcgccggcgg 2160 

ggcgacagca gggggagcct actctccccc aggcccgttt cctacttgaa gggctcttcg 2220 

ggcggtccac tgctctgccc ctcggggcac gctgtgggca tctttcgggc tgccgtgtgc 2280 

acccgagggg ttgcgaaggc ggtggacttt gtacccgtcg agtctatggg aaccactatg 2340 

cggtccccgg tcttcacgga caactcgtcc cctccggccg taccgcagac attccaggtg 2400 

gcccatctac acgcccctac tggtagcggc aagagcacta aggtgccggc tgcgtatgca 2460 

gcccaagggt ataaggtgct tgtcctgaac ccgtccgtcg ccgccaccct aggtttcggg 2520

gcgtatatgt ctaaggcaca tggtatcgac cctaacatca gaaccggggt aaggaccatc 2580 

accacgggtg cccccatcac gtactccacc tatggcaagt ttcttgccga cggtggttgc 2640 

tctgggggcg cctatgacat cataatatgt gatgagtgcc actcaactga ctcgaccact 2700 

atcctgggca tcggcacagt cctggaccaa gcggagacgg ctggagcgcg actcgtcgtg 2760 

ctcgccaccg ctacgcctcc gggatcggtc accgtgccac atccaaacat cgaggaggtg 2820 

gctctgtcca gcactggaga aatccccttt tatggcaaag ccatccccat cgagaccatc 2880

aaggggggga ggcacctcat tttctgccat tccaagaaga aatgtgatga gctcgccgcg 2940 

aagctgtccg gcctcggact caatgctgta gcatattacc ggggccttga tgtatccgtc 3000

ataccaacta gcggagacgt cattgtcgta gcaacggacg ctctaatgac gggctttacc 3060 

ggcgatttcg actcagtgat cgactgcaat acatgtgtca cccagacagt cgacttcagc 3120 

ctggacccga ccttcaccat tgagacgacg accgtgccac aagacgcggt gtcacgctcg 3180 

cagcggcgag gcaggactgg taggggcagg atgggcattt acaggtttgt gactccagga 3240 

gaacggccct cgggcatgtt cgattcctcg gttctgtgcg agtgctatga cgcgggctgt 3300

gcttggtacg agctcacgcc cgccgagacc tcagttaggt tgcgggctta cctaaacaca 3360 

ccagggttgc ccgtctgcca ggaccatctg gagttctggg agagcgtctt tacaggcctc 3420 

acccacatag acgcccattt cttgtcccag actaagcagg caggagacaa cttcccctac 3480 

ctggtagcat accaggctac ggtgtgcgcc agggctcagg ctccacctcc atcgtgggac 3540

caaatgtgga agtgtctcat acggctaaag cctacgctgc acgggccaac gcccctgctg 3600

tataggctgg gagccgttca aaacgaggtt actaccacac accccataac caaatacatc 3660 

atggcatgca tgtcagctga cctggaggtc gtcacgagca cctgggtgct ggtaggcgga 3720

gtcctagcag ctctggccgc gtattgcctg acaacaggca gcgtggtcat tgtgggcagg 3780 

atcatcttgt ccggaaagcc ggccatcatt cccgacaggg aagtctttta ccgggagttc 3840 

gatgagatgg aagagtgcgc ctcacacctc ccttacatcg aacggggaat gcagctcgcc 3900 






gaacatttca aacagaaggc aatcgggttg ctgcaaacag ccaccaagca agcggaggct 3960 

gctgctcncg cggtggaatc caagtggcgg accctcgaag ccttctgggc gaagcatatg 4020 

tggaatttca tcagcgggat acaatattta gcaggcttgt ccactctgcc tggcaacccc 4080

gcgatagcat cactgatggc attcacagcc tctatcacca gcccgctcac cacccaacat 4140 

accctcctgt ttaacatcct ggggggatgg gtggccgccc aacttgctcc tcccagcgct 4200

gcttctgctt tcgtaggcgc cggcatcgct ggagcggctg ttggcagcat aggccttggg 4260

aaggtgcttg tggatatttt ggcaggttat ggagcagggg tggcaggcgc gctcgtggcc 4320

tttaaggtca tgagcggcga gatgccctcc accgaggacc tggttaacct actccctgct 4380 

atcctctccc ctggcgccct agtcgtcggg gtcgtgtgcg cagcgatact gcgtcggcac 4440

gtgggcccag gggagggggc tgtgcagtgg atgaaccggc tgatagcgtt cgcttcgcgg 4500 

ggtaaccacg tctcccccac gcactatgtg cctgagagcg acgctgcagc acgtgtcact 4560

cagatcctct ctagtcttac catcactcag ctgctgaaga ggcttcacca gtggatcaac 4620 

gaggactgct ccacgccatg ctccggctcg tggctaagag atgtttggga ttggatatgc 4680 

acggtgttga ctgatttcaa gacctggctc cagtccaagc tcctgccgcg attgccggga 4740 

gtccccttct tctcatgtca acgtgggtac aagggagtct ggcggggcga cggcatcatg 4800 

caaaccacct gcccatgtgg agcacagatc accggacatg tgaaaaacgg ttccatgagg 4860 

atcgtggggc ctaggacctg tagtaacacg tggcatggaa cattccccat taacgcgtac 4920 

accacgggcc cctgcacgcc ctccccggcg ccaaattatt ctagggcgct gtggcgggtg 4980 

gctgctgagg agtacgtgga ggttacgcgg gtgggggatt tccactacgt gacgggcatg 5040 

accactgaca acgtaaagtg cccgtgtcag gttccggccc ccgaattctt cacagaagtg 5100 

gatggggtgc ggttgcacag gtacgctcca gcgtgcaaac ccctcctacg ggaggaggtc 5160 

acattcctgg tcgggctcaa tcaatacctg gttgggtcac agctcccatg cgagcccgaa 5220 

ccggacgtag cagtgctcac ttccatgctc accgacccct cccacattac ggcggagacg 5280

gctaagcgta ggctggccag gggatctccc ccctccttgg ccagctcatc agctatccag 5340

ctgtctgcgc cttccttgaa ggcaacatgc actacccgtc atgactcccc ggacgctgac 5400 

ctcatcgagg ccaacctcct gtggcggcag gagatgggcg ggaacatcac ccgcgtggag 5460 

tcagaaaata aggtagtaat tttggagtct ttcgagccgc tccaagcgga ggaggatgag 5520 

agggaagtat ccgttccggc ggagatcctg cggaggtcca ggaaattccc tcgagcgatg 5580 

cccatatggg cactcccgga ttacaaccct ccactgttag agtcctggaa ggacccggac 5640 

tacgtccctc cagtggtaca cgggtgtcca ttgccgcctg ccaaggcccc tccggtacca 5700

cctccacgga ggaagaggac ggttgtcctg tcagaatcta ccgtgtcttc tgccttggcg 5760 

gagctcgcca caaagacctt cggcagctcc gaatcgtcgg ccgtcgacag cggcacggca 5820 

acggcctctc ctggtgagga cgtcgtctgc tgctcgatgt cctacacatg gacaggcgcc 5880 

ctgatcacgc catgcgctgc ggaggaaacc aagctgccca tcaatgcact gagcaactct 5940 

ttgctccgac accacaactt ggtctatgct acaacatctc gcagcgcaag cctgcggcag 6000

aagaaggtca cctttgacag actgcaggtc ctggacgacc actaccggga cgtgctcaag 6060 

gagatgaagg cgaaggcgtc cacagttaag gctaaacttc tatccgtgga ggaagcctgt 6120

aagctgacgc ccccacattc ggccagatct aaatttggct atggggcaaa ggacgtccgg 6180 

aacctatcca gcaaggccgt taaccacatc cgctccgtgt ggaaggactt gctggaagac 6240 

actgagacac caattgacac caccatcatg gcaaaaaatg aggttttctg cgtccaacca 6300 

gagaaggggg gccgcaagcc agctcgcctt atcgtattcc cagatttggg ggttcgtgtg 6360 

tgcgagaaaa tggcccttta cgatgtggtc tccaccctcc ctcaggccgt gatgggctct 6420 

tcatacggat tccaatactc tcctggacag cgggtcgagt tcctggtgaa tgcctggaaa 6480 

gcgaagaaat gccctatggg cttcgcatat gacacccgct gttttgactc aacggtcact 6540 

gagaatgaca tccgtgttga ggagtcaatc taccaatgtt gtgacttggc ccccgaagcc 6600 

agacaggcca taaggtcgct cacagagcgg ctttacatcg ggggccccct gactaattct 6660 

aaagggcaga actgcggcta tcgccggtgc cgcgcgagcg gtgtactgac gaccagctgc 6720 

ggtaataccc tcacatgtta cttgaaggcc gctgcggcct gtcgagctgc gaagctccag 6780 

gactgcacga tgctcgtatg cggagacgac cttgtcgtta tctgtgaaag cgcggggacc 6840 

caagaggacg aggcgagcct acgggccttc acggaggcta tgactagata ctctgccccc 6900 

cctggggacc cgcccaaacc agaatacgac ttggagttga taacatcatg ctcctccaat 6960 

gtgtcagtcg cgcacgatgc atctggcaaa agggtgtact atctcacccg tgaccccacc 7020

accccccttg cgcgggctgc gtgggagaca gctagacaca ctccagtcaa ttcctggcta 7080 

ggcaacatca tcatgtatgc gcccaccttg tgggcaagga tgatcctgat gactcatttc 7140 

ttctccatcc ttctagctca ggaacaactt gaaaaagccc tagattgtca gatctacggg 7200 

gcctgttact ccattgagcc acttgaccta cctcagatca ttcaacgact ccatggcctt 7260 

agcgcatttt cactccatag ttactctcca ggtgagatca atagggtggc ttcatgcctc 7320 

aggaaacttg gggtaccgcc cttgcgagtc tggagacatc gggccagaag tgtccgcgct 7380 

aggctactgt cccagggggg gagggctgcc acttgtggca agtacctctt caactgggca 7440 

gtaaggacca agctcaaact cactccaatc ccggctgcgt cccagttgga tttatccagc 7500 

tggttcgttg ctggttacag cgggggagac atatatcaca gcctgtctcg tgcccgaccc 7560 

cgctggttca tgtggtgcct actcctactt tctgtagggg taggcatcta tctactcccc 7620 

aaccgatgaa cggggaccta aacactccag gccaataggc catcctgttt ttttcccttt 7680 

ttttttttct tttttttttt tttttttttt tttttttttt tttctccttt ttttttcctc 7740 

tttttttcct tttctttcct ttggtggctc catcttagcc ctagtcacgg ctagctgtga 7800 

aaggtccgtg agccgcttga ctgcagagag tgctgatact ggcctctctg cagatcaagt 7860 
actcctgcag gcgcgccact agtgggaata cgcggggtat gccgcgtttt agcatattga 7920

cgacccaatt ctcatgtttg acagcttatc atcgataagc tttaatgcgg tagtttatca 7980 

cagttaaatt gctaacgcag tcaggcaccg tgtatgaaat ctaacaatgc gctcatcgtc 8040 

atcctcggca ccgtcaccct ggatgctgta ggcataggct tggttatgcc ggtactgccg 8100 

ggcctcttgc gggatatcgt ccattccgac agcatcgcca gtcactatgg cgtgctgcta 8160 

gcgctatatg cgttgatgca atttctatgc gcacccgttc tcggagcact gtccgaccgc 8220 

tttggccgcc gcccagtcct gctcgcttcg ctacttggag ccactatcga ctacgcgatc 8280 

atggcgacca cacccgtcct gtggatcctc tacgccggac gcatcgtggc cggcatcacc 8340 

ggcgccacag gtgcggttgc tggcgcctat atcgccgaca tcaccgatgg ggaagatcgg 8400 

gctcgccact tcgggctcat gagcgcttgt ttcggcgtgg gtatggtggc aggccccgtg 8460 

gccgggggac tgttgggcgc catctccttg catgcaccat tccttgcggc ggcggtgctc 8520

aacggcctca acctactact gggctgcttc ctaatgcagg agtcgcataa gggagagcgt 8580 

cgaccgatgc ccttgagagc cttcaaccca gtcagctcct tccggtgggc gcggggcatg 8640 

actatcgtcg ccgcacttat gactgtcttc tttatcatgc aactcgtagg acaggtgccg 8700 

gcagcgctct gggtcatttt cggcgaggac cgctttcgct ggagcgcgac gatgatcggc 8760

ctgtcgcttg cggtattcgg aatcttgcac gccctcgctc aagccttcgt cactggtccc 8820

gccaccaaac gtttcggcga gaagcaggcc attatcgccg gcatggcggc cgacgcgctg 8880

ggctacgtct tgctggcgtt cgcgacgcga ggctggatgg ccttccccat tatgattctt 8940 

ctcgcttccg gcggcatcgg gatgcccgcg ttgcaggcca tgctgtccag gcaggtagat 9000 

gacgaccatc agggacagct tcaaggatcg ctcgcggctc ttaccagcct aacttcgatc 9060 

actggaccgc tgatcgtcac ggcgatttat gccgcctcgg cgagcacatg gaacgggttg 9120 

gcatggattg taggcgccgc cctatacctt gtctgcctcc ccgcgttgcg tcgcggtgca 9180 

tggagccggg ccacctcgac ctgaatggaa gccggcggca cctcgctaac ggattcacca 9240 

ctccaagaat tggagccaat caattcttgc ggagaactgt gaatgcgcaa accaaccctt 9300

ggcagaacat atccatcgcg tccgccatct ccagcagccg cacgcggcgc atctcgggca 9360 

gcgttgggtc ctggccacgg gtgcgcatga tcgtgctcct gtcgttgagg acccggctag 9420 

gctggcgggg ttgccttact ggttagcaga atgaatcacc gatacgcgag cgaacgtgaa 9480 

gcgactgctg ctgcaaaacg tctgcgacct gagcaacaac atgaatggtc ttcggtttcc 9540 

gtgtttcgta aagtctggaa acgcggaagt cagcgccctg caccattatg ttccggatct 9600 

gcatcgcagg atgctgctgg ctaccctgtg gaacacctac atctgtatta acgaagcgct 9660 

ggcattgacc ctgagtgatt tttctctggt cccgccgcat ccataccgcc agttgtttac 9720 

cctcacaacg ttccagtaac cgggcatgtt catcatcagt aacccgtatc gtgagcatcc 9780 

tctctcgttt catcggtatc attaccccca tgaacagaaa ttccccctta cacggaggca 9840 

tcaagtgacc aaacaggaaa aaaccgccct taacatggcc cgctttatca gaagccagac 9900 

attaacgctt ctggagaaac tcaacgagct ggacgcggat gaacaggcag acatctgtga 9960 

atcgcttcac gaccacgctg atgagcttta ccgcagctgc ctcgcgcgtt tcggtgatga 10020 

cggtgaaaac ctctgacaca tgcagctccc ggagacggtc acagcttgtc tgtaagcgga 10080 

tgccgggagc agacaagccc gtcagggcgc gtcagcgggt gttggcgggt gtcggggcgc 10140 

agccatgacc cagtcacgta gcgatagcgg agtgtatact ggcttaacta tgcggcatca 10200 

gagaagattg tactgagagt gcaccatatg cggtgtgaaa taccgcacag atgcgtaagg 10260 

agaaaatacc gcatcaggcg ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 10320

gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 10380

tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 10440 

aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 10500 

aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 10560 

ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 10620

tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 10680 

agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 10740 

gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 10800 

tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 10860

acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt atttggtatc 10920

tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 10980 

caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 11040 

aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 11100 

aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 11160 

ttctagataa tacgactcac tata 11184 